QUASIRANDOM LOAD BALANCING*

TOBIAS FRIEDRICH†, MARTIN GAIRING‡, AND THOMAS SAUERWALD§

Abstract. We propose a simple distributed algorithm for balancing indivisible tokens on 
graphs. The algorithm is completely deterministic, though it tries to imitate (and enhance) a 
random algorithm by keeping the accumulated rounding errors as small as possible. 

Our new algorithm surprisingly closely approximates the idealized process (where the tokens 
are divisible) on important network topologies. On d-dimensional torus graphs with n nodes 
it deviates from the idealized process only by an additive constant. In contrast to that, the 
randomized rounding approach of Friedrich and Sauerwald [11] can deviate up to
$\Omega(\mathrm{polylog}(n))$ and the deterministic algorithm of Rabani, Sinclair and
Wanka [32] has a deviation of $\Omega(n^{1/d})$.
This makes our quasirandom algorithm the first known algorithm for this setting which is
optimal both in time and achieved smoothness. We further show that also on the hypercube
our algorithm has a smaller deviation from the idealized process than the previous algorithms. 

1. Introduction. Load balancing is an important requisite for the efficient
utilization of computational resources in parallel and distributed systems. The aim is to
reallocate the load such that at the end each node has approximately the same load.
Load balancing problems have various applications, e.g., for scheduling [36], routing [5],
and numerical computation [37, 38].

Typically, load balancing algorithms iteratively exchange load along edges of an
undirected connected graph. In the natural diffusion paradigm, an arbitrary amount
of load can be sent along each edge at each step [30, 32]. For the idealized case of
divisible load, a popular diffusion algorithm is the first-order scheme by Subramanian
and Scherson [35] whose convergence rate is fairly well captured in terms of the spectral
gap [26].

However, for many applications the assumption of divisible load may be invalid. 
Therefore, we consider the discrete case where the load can only be decomposed in in- 
divisible unit-size tokens. It is a very natural question by how much this integrality 
assumption decreases the efficiency of load balancing. In fact, finding a precise
quantitative relationship between the discrete and the idealized case is an open problem posed
by many authors, e.g., [9, 11, 14, 15, 27, 30, 32, 35]. 

A simple method for approximating the idealized process was analyzed by Rabani,
Sinclair, and Wanka [32]. Their approach (which we will call "RSW-algorithm") is to
round down the fractional flow of the idealized process. They introduce a very useful
parameter of the graph called local divergence and prove that it gives tight upper bounds
on the deviation between the idealized process and their discrete process. However,
one drawback of the RSW-algorithm is that it can end up in rather unbalanced states
(cf. Proposition 6.1). To overcome this problem, Friedrich and Sauerwald analyzed a
new algorithm based on randomized rounding [11]. On many graphs, this algorithm
approximates the idealized case much better than the approach of always rounding down
of the RSW-algorithm. A natural question is whether this randomized algorithm can be
derandomized without sacrificing its performance. For the graphs considered in this
work, we answer this question in the affirmative. We introduce a quasirandom load balancing
algorithm which rounds up or down deterministically such that the accumulated rounding
errors on each edge are minimized. Our approach follows the concept of quasirandomness
as it deterministically imitates the expected behavior of its random counterpart.

Our Results. We focus on two network topologies: hypercubes and torus graphs.
Both have been intensively studied in the context of load balancing (see e.g., [11, 13,
19, 31, 32]). We measure the smoothness of the load by the so-called discrepancy (see
e.g. [9, 11, 15, 32]) which is the difference between the maximum and minimum load
among all nodes.

For d-dimensional torus graphs we prove that our quasirandom algorithm approximates
the idealized process up to an additive constant (Theorem 5.4). More precisely, for
all initial load distributions and time steps, the load of any vertex in the discrete process
differs from the respective load in the idealized process only by a constant. This holds
even for non-uniform torus graphs with different side-lengths (cf. Definition 5.1). For the
uniform torus graph our results are to be compared with a deviation of
$\Omega(\mathrm{polylog}(n))$ for the randomized rounding approach (Theorem 6.3) and
$\Omega(n^{1/d})$ for the RSW-algorithm (Proposition 6.1). Hence, although our approach is
deterministic, it also improves over its random counterpart. Starting with an initial
discrepancy of $K$, the idealized process reaches a constant discrepancy after
$O(n^{2/d} \log(Kn))$ steps (cf. Corollary 3.2). Hence the same holds for our quasirandom
algorithm, which makes it the first algorithm for the discrete case which is optimal both
in time and discrepancy.

For the hypercube we prove a deviation of our quasirandom algorithm from the
idealized process of $\Theta(\log n)$ (Theorem 4.2). For this topology we also show that the
deviation of the random approach is $\Omega(\log n)$ (Theorem 6.2) while the deviation of the
RSW-algorithm is $\Omega(\log^2 n)$ (Proposition 6.1). Again, our quasirandom algorithm is at
least as good as the randomized rounding algorithm and substantially better than the
RSW-algorithm [32]. In particular, we obtain that for any load vector with initial
discrepancy $K$, our quasirandom algorithm achieves a discrepancy of at most $\log n$ after
$O(\log n \log(Kn))$ steps.

Our Techniques. Instead of analyzing our quasirandom algorithm directly, we exam-
ine a new generic class of load balancing algorithms that we call bounded error diffusion 
(BED). Roughly speaking, in a BED algorithm the accumulated rounding error on each 
edge is bounded by some constant at all times. This class includes our quasirandom 
algorithm. 

The starting point of [32] and [11] as well as our paper is to express the deviation
from the idealized case by a certain sum of weighted rounding errors (equation (3.1)).
In this sum, the rounding errors are weighted by transition probabilities of a certain
random walk. Roughly speaking, Rabani et al. [32] estimate this sum directly by adding
up all transition probabilities. In the randomized approach of [11], the sum is bounded
by Chernoff-type inequalities relying on independent rounding decisions. We take a
completely different approach and prove that the transition probabilities between two
fixed vertices are unimodal in time (cf. Theorem 4.9 for the hypercube). This allows
us to upper bound the complete sum by its maximal summand (Lemma 3.6) for BED
algorithms. The intriguing combinatorial property of unimodality is the heart of our
proof and seems to be the main reason why we can outperform the previous approaches.
Even though unimodality has a one-line definition, it has become apparent that proving
it can be a very challenging task requiring intricate combinatorial constructions or refined
mathematical tools (see e.g. Stanley's survey [34]).

It turns out that this is also the case for the considered transition probabilities of 
torus graphs and hypercubes. The reason is that explicit formulas seem to be intractable 
and typical approximations (e.g. Poissonization [6]) are way too loose to compare con-
secutive transition probabilities. For the d-dimensional torus, we use a local central limit 
theorem to approximate the transition probabilities by a multivariate normal distribution 
which is known to be unimodal. 

On hypercubes the above method fails as several inequalities for the torus graph
are only true for constant $d$. However, we can employ the additional symmetries of the
hypercube to prove unimodality of the transition probabilities by relating it to a random
walk on a weighted path. Somewhat surprisingly, this intriguing property was unknown 
before, although random walks on hypercubes have been intensively studied (see e.g. [6, 
21, 28]). 

We prove this unimodality result by establishing an interesting result concerning
first-passage probabilities of a random walk on paths with arbitrary transition
probabilities: If the loop probabilities are at least 1/2, then the first-passage probability
distribution can be expressed as a convolution of independent geometric distributions.
In particular, this implies that these probabilities are log-concave. Reducing the random
walk on a hypercube to a random walk on a weighted path, we obtain that the transition
probabilities on the hypercube are unimodal. Estimating the maximum probabilities
via a balls-and-bins process, we finally obtain our upper bound on the deviation for the
hypercube.

We believe that our probabilistic result for paths is of independent interest, as random
walks on paths are among the most extensively studied stochastic processes.
Moreover, many analyses of randomized algorithms can be reduced to such random walks 
(see e.g. [29, Thm. 6.1]). 

Related Work. In the approach of Elsässer and Sauerwald [8] certain interacting
random walks are used to reduce the load deviation. This randomized algorithm achieves
a constant additive error between the maximum and average load on hypercubes and
torus graphs in time $O(\log(Kn)/(1 - \lambda_2))$, where $\lambda_2$ is the second largest eigenvalue of
the diffusion matrix. However, in contrast to our deterministic algorithm, this algorithm 
is less natural and more complicated (e.g., the nodes must know an accurate estimate of 
the average load). 

Aiello et al. [1] and Ghosh et al. [15] studied balancing algorithms where in each 
time step at most one token is transmitted over each edge. Due to this restriction, these 
algorithms take substantially more time, i.e., they run in time at least linear in the
initial discrepancy K. Nonetheless, the best known bounds on the discrepancy are only 
polynomial in n for the torus and f2(log^ n) for the hypercube [15]. 

In another common model, nodes are only allowed to exchange load with at most one
neighbor in each time step, see e.g., [11, 14, 32]. In fact, the afore-mentioned randomized 
rounding approach [11] was analyzed in this model. However, the idea of randomly 
rounding the fractional flow such that the expected error is zero naturally extends to our 
diffusive setting where a node may exchange load with all neighbors simultaneously. 

Quasirandomness describes a deterministic process which imitates certain properties 
of a random process. Our quasirandom load balancing algorithm imitates the property 
that rounding up and down the flow between two vertices occurs roughly equally often 
by a deterministic process which minimizes these rounding errors directly. This way, 
we keep the desired property that the "expected" accumulated rounding error is zero, 
but remove almost all of its (undesired) variance. Similar concepts have been used for
deterministic random walks [4], external mergesort [2], and quasirandom rumor
spreading [7]. The latter work presents a quasirandom algorithm which is able to broadcast
a piece of information at least as fast as its random counterpart on the hypercube and
most random graphs. However, in case of rumor spreading the quasirandom protocol
is just slightly faster than the random protocol while the quasirandom load-balancing
algorithm presented here substantially outperforms its random counterpart.

Organization of the paper. In Section 2, we give a description of our bounded error
diffusion (BED) model. For a better comparison, we present some results for the previous
algorithms of [11] and [32] in Section 6. In Section 3, we introduce our basic method
which is used in Sections 4 and 5 to analyze BED algorithms on hypercubes and torus
graphs, respectively.




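2. Model and algorithms. We aim at balancing load on a connected, undirected
graph $G = (V, E)$. Denote by $\deg(i)$ the degree of node $i \in V$ and let
$\Delta = \Delta(G) = \max_{i \in V} \deg(i)$ be the maximum degree of $G$. The balancing
process is governed by an ergodic, doubly-stochastic diffusion matrix $P$ with

\[
P_{i,j} =
\begin{cases}
\frac{1}{2\Delta} & \text{if } \{i,j\} \in E, \\
1 - \frac{\deg(i)}{2\Delta} & \text{if } i = j, \\
0 & \text{otherwise.}
\end{cases}
\]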

Let $x^{(t)}$ be the load-vector of the vertices at step $t$ (or more precisely, after the
completion of the balancing procedure at step $t$). The discrepancy of such a (row) vector
$x$ is $\max_{i,j}(x_i - x_j)$, and the discrepancy at step $0$ is called initial discrepancy $K$.

The idealized process. In one time step each pair $(i,j)$ of adjacent vertices shifts
divisible tokens between $i$ and $j$. We have the following iteration, $x^{(t)} = x^{(t-1)} P$ and
inductively, $x^{(t)} = x^{(0)} P^t$. Equivalently, for any edge $\{i,j\} \in E$ and step $t$, the flow
from $i$ to $j$ at step $t$ is $P_{i,j} x_i^{(t-1)} - P_{j,i} x_j^{(t-1)}$. Note that the symmetry of $P$ implies
that for $t \to \infty$, $x^{(t)}$ converges towards the uniform vector in which every vertex holds
the average load (equivalently, every row of $P^t$ converges to $(1/n, 1/n, \ldots, 1/n)$).
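
As a small illustration of the idealized process, the following sketch iterates $x^{(t)} = x^{(t-1)} P$ with exact rational arithmetic. It assumes that the diffusion matrix puts weight $1/(2\Delta)$ on every edge and the remaining mass on the diagonal; the cycle instance, the initial load, and all function names are hypothetical choices made only for this example.

```python
from fractions import Fraction

def diffusion_matrix(n, edges):
    """Assumed form: P[i][j] = 1/(2*Delta) on edges, remaining mass on the diagonal."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    delta = max(deg)
    P = [[Fraction(0)] * n for _ in range(n)]
    for i, j in edges:
        P[i][j] = P[j][i] = Fraction(1, 2 * delta)
    for i in range(n):
        P[i][i] = 1 - Fraction(deg[i], 2 * delta)
    return P

def idealized_step(x, P):
    """One step x -> x P for a row vector x of divisible load."""
    n = len(x)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

def discrepancy(x):
    """Difference between maximum and minimum load."""
    return max(x) - min(x)

# Hypothetical instance: a cycle on 6 vertices, all load initially on vertex 0.
n = 6
edges = [(i, (i + 1) % n) for i in range(n)]
P = diffusion_matrix(n, edges)
x = [Fraction(60)] + [Fraction(0)] * (n - 1)
for _ in range(200):
    x = idealized_step(x, P)
print(float(discrepancy(x)))  # shrinks towards 0 as the load approaches the average 10
```

Exact fractions are used so that conservation of the total load can be checked without rounding noise.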

The discrete process. There are different ways how to handle non-divisible tokens.
We define the following bounded error diffusion (BED) model. Let $\Phi_{i,j}^{(t)}$ denote the
integral flow from $i$ to $j$ at time $t$. As $\Phi_{i,j}^{(t)} = -\Phi_{j,i}^{(t)}$, we have
$x_i^{(t)} = x_i^{(t-1)} - \sum_{j \colon \{i,j\} \in E} \Phi_{i,j}^{(t)}$.
Let $e_{i,j}^{(t)} := \bigl(P_{i,j} x_i^{(t-1)} - P_{j,i} x_j^{(t-1)}\bigr) - \Phi_{i,j}^{(t)}$ be the excess load (or lack
of load) allocated to $i$ as a result of rounding on edge $\{i,j\}$ in time step $t$. Note that
for all vertices $i$, $x_i^{(t)} = (x^{(t-1)} P)_i + \sum_{j \colon \{i,j\} \in E} e_{i,j}^{(t)}$. Let now $\Lambda$ be an upper
bound for the accumulated rounding errors (deviation from the idealized process), that
is, $\bigl|\sum_{s=1}^{t} e_{i,j}^{(s)}\bigr| \le \Lambda$ for all $t \in \mathbb{N}$ and $\{i,j\} \in E$. All our bounds still hold if
$\Lambda$ is a function of $n$ and/or $t$, but we only say that an algorithm is a BED algorithm if
$\Lambda$ is a constant.

Our new quasirandom diffusion algorithm chooses for $P_{i,j} x_i^{(t-1)} \ge P_{j,i} x_j^{(t-1)}$ the
flow $\Phi_{i,j}^{(t)}$ from $i$ to $j$ to be either
$\Phi_{i,j}^{(t)} = \lfloor P_{i,j} x_i^{(t-1)} - P_{j,i} x_j^{(t-1)} \rfloor$ or
$\Phi_{i,j}^{(t)} = \lceil P_{i,j} x_i^{(t-1)} - P_{j,i} x_j^{(t-1)} \rceil$
such that $\bigl|\sum_{s=1}^{t} e_{i,j}^{(s)}\bigr|$ is minimized. This yields a BED algorithm with
$\Lambda \le 1/2$ and can be implemented with $\lceil \log_2 \Delta \rceil$ storage per edge. Note that one
can imagine various other natural (deterministic or randomized) BED algorithms. To do so,
the algorithm only has to ensure that the errors do not add up to more than a constant.
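
The rounding rule can be sketched in a few lines. The code below is an illustrative sketch, not the authors' implementation: it assumes a common edge weight `p` (standing in for $P_{i,j} = P_{j,i}$), stores one accumulated error per edge in a dictionary `acc`, and handles both flow directions by choosing between the two integers adjacent to the fractional flow. The 4 x 4 torus instance and all names are hypothetical.

```python
from fractions import Fraction
from math import floor, ceil

def quasirandom_step(x, edges, p, acc):
    """One step of quasirandom diffusion with common edge weight p.

    acc[(i, j)] is the accumulated rounding error of edge (i, j) and is
    updated in place. Returns the new integer load vector.
    """
    new_x = list(x)
    for (i, j) in edges:
        ideal = p * (x[i] - x[j])            # fractional flow from i to j
        lo, hi = floor(ideal), ceil(ideal)
        err_lo = acc[(i, j)] + (ideal - lo)  # accumulated error if we round down
        err_hi = acc[(i, j)] + (ideal - hi)  # accumulated error if we round up
        # Pick the rounding that keeps the accumulated error smallest.
        if abs(err_lo) <= abs(err_hi):
            flow, acc[(i, j)] = lo, err_lo
        else:
            flow, acc[(i, j)] = hi, err_hi
        new_x[i] -= flow
        new_x[j] += flow
    return new_x

# Hypothetical instance: 4 x 4 torus (degree 4, so p = 1/8), all tokens on vertex 0.
m = 4
n = m * m
edges = []
for r in range(m):
    for c in range(m):
        v = r * m + c
        edges.append((v, r * m + (c + 1) % m))    # right neighbour
        edges.append((v, ((r + 1) % m) * m + c))  # lower neighbour
p = Fraction(1, 8)
acc = {e: Fraction(0) for e in edges}
x = [1000] + [0] * (n - 1)
worst = Fraction(0)
for _ in range(100):
    x = quasirandom_step(x, edges, p, acc)
    worst = max(worst, max(abs(a) for a in acc.values()))
print(sum(x), max(x) - min(x), worst)
```

Because the two candidate errors differ by exactly one, the smaller of them always has absolute value at most 1/2, which is the bound $\Lambda \le 1/2$ stated above.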

With above notation, the RSW-algorithm uses
$\Phi_{i,j}^{(t)} = \lfloor P_{i,j} x_i^{(t-1)} - P_{j,i} x_j^{(t-1)} \rfloor$, provided
that $P_{i,j} x_i^{(t-1)} \ge P_{j,i} x_j^{(t-1)}$. In other words, the flow on each edge is always
rounded down. In our BED framework this would imply a $\Lambda$ of order $T$ after $T$ time steps.

A simple randomized rounding diffusion algorithm chooses for
$P_{i,j} x_i^{(t-1)} \ge P_{j,i} x_j^{(t-1)}$ the flow $\Phi_{i,j}^{(t)}$ as the randomized rounding of
$P_{i,j} x_i^{(t-1)} - P_{j,i} x_j^{(t-1)}$, that is, it rounds up with probability
$\bigl(P_{i,j} x_i^{(t-1)} - P_{j,i} x_j^{(t-1)}\bigr) - \lfloor P_{i,j} x_i^{(t-1)} - P_{j,i} x_j^{(t-1)} \rfloor$
and rounds down otherwise. This typically achieves an error $\Lambda$ of order $\sqrt{T}$ after $T$
time steps.
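
To make the three growth rates tangible, the following sketch applies the three rounding rules to the same stream of fractional flows on a single edge and records the largest accumulated error. The constant flow of 3/10 per step, the seed, and the function names are assumptions chosen for illustration only.

```python
import random
from fractions import Fraction
from math import floor

def round_down(flow, acc, rng):
    """RSW-style rule: always round the flow down."""
    return floor(flow)

def round_random(flow, acc, rng):
    """Randomized rounding: round up with probability equal to the fractional part."""
    frac = flow - floor(flow)
    return floor(flow) + (1 if rng.random() < frac else 0)

def round_quasirandom(flow, acc, rng):
    """Pick the neighbouring integer that keeps the accumulated error smallest."""
    lo = floor(flow)
    return lo if abs(acc + flow - lo) <= abs(acc + flow - lo - 1) else lo + 1

def max_accumulated_error(rule, flows, seed=0):
    """Largest absolute accumulated rounding error over the whole run."""
    rng = random.Random(seed)
    acc, worst = Fraction(0), Fraction(0)
    for f in flows:
        acc += f - rule(f, acc, rng)
        worst = max(worst, abs(acc))
    return worst

flows = [Fraction(3, 10)] * 1000   # hypothetical fractional flows on one edge
for rule in (round_down, round_random, round_quasirandom):
    print(rule.__name__, float(max_accumulated_error(rule, flows)))
```

In this toy run the round-down rule accumulates an error linear in the number of steps, while the quasirandom rule never exceeds 1/2.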

Handling Negative Loads. Unless there is a lower bound on the minimum load of a
vertex, negative loads may occur during the balancing procedure. In the following, we
describe a simple approach to cope with this problem.

Consider a graph $G$ for which we can prove a deviation of at most $\gamma$ from the
idealized process. Let $x^{(0)}$ be the initial load vector with an average load of $\bar{x}$. Then
at the beginning of the balancing procedure, each node generates $\gamma$ additional (virtual)
tokens. During the balancing procedure, these tokens are regarded as common tokens,
but at the end they are ignored. First observe that since the minimum load at each node
in the idealized process is at least $\gamma$, it follows that at each step, every node has at least
a load of zero in the discrete process. Since each node has a load of $\bar{x} + O(\gamma)$ at the end,
we end up with a load distribution where the maximum load is still $\bar{x} + O(\gamma)$ (ignoring
the virtual tokens).

3. Basic method to analyze our quasirandom algorithm. To bound runtime
and discrepancy of a BED algorithm, we always bound the deviation between the
idealized model and the discrete model which is an important measure in its own right.
For this, let $x_\ell^{(t)}$ denote the load on vertex $\ell$ in step $t$ in the discrete model and
$\xi_\ell^{(t)}$ denote the load on vertex $\ell$ in step $t$ in the idealized model. We assume that the
discrete and idealized model start with the same initial load, that is, $x^{(0)} = \xi^{(0)}$. As
derived in Rabani et al. [32], their difference can be written as

\[
x_\ell^{(t)} - \xi_\ell^{(t)} = \sum_{s=0}^{t-1} \sum_{[i:j] \in E} e_{i,j}^{(t-s)} \bigl( P^s_{\ell,i} - P^s_{\ell,j} \bigr), \tag{3.1}
\]

where $[i:j]$ refers to an edge $\{i,j\} \in E$ with $i < j$, where "$<$" is some arbitrary but
fixed ordering on the vertices $V$. It will be sufficient to bound equation (3.1), as the
convergence speed of the idealized process can be bounded in terms of the second largest
eigenvalue.
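
Equation (3.1) can be checked numerically on a small instance. The sketch below runs a discrete process (here with round-down flows, although any integral rounding would do) next to the idealized one, records the per-edge errors, and compares both sides of (3.1) exactly. The 5-cycle, the initial loads, the weights $P_{i,j} = 1/4$ and $P_{i,i} = 1/2$, and the fixed edge orientation are assumptions made for this example.

```python
from fractions import Fraction
from math import floor

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def vec_mat(x, P):
    n = len(x)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical instance: cycle on 5 vertices, each edge stored once with a fixed orientation.
n, T = 5, 12
edges = [(i, (i + 1) % n) for i in range(n)]
P = [[Fraction(0)] * n for _ in range(n)]
for i, j in edges:
    P[i][j] = P[j][i] = Fraction(1, 4)
for i in range(n):
    P[i][i] = Fraction(1, 2)

x = [37, 0, 5, 11, 2]              # discrete loads
xi = [Fraction(v) for v in x]      # idealized loads, same start
errors = []                        # errors[t - 1][(i, j)] holds e_ij^(t)
for t in range(1, T + 1):
    new_x, e_t = list(x), {}
    for i, j in edges:
        ideal = P[i][j] * x[i] - P[j][i] * x[j]
        flow = floor(ideal)        # any integral rounding satisfies (3.1)
        e_t[(i, j)] = ideal - flow
        new_x[i] -= flow
        new_x[j] += flow
    x, xi = new_x, vec_mat(xi, P)
    errors.append(e_t)

# powers[s] = P^s for s = 0, ..., T - 1
powers = [[[Fraction(int(a == b)) for b in range(n)] for a in range(n)]]
for _ in range(T - 1):
    powers.append(mat_mul(powers[-1], P))

lhs = [x[l] - xi[l] for l in range(n)]
rhs = [sum(errors[T - s - 1][(i, j)] * (powers[s][l][i] - powers[s][l][j])
           for s in range(T) for (i, j) in edges)
       for l in range(n)]
print(lhs == rhs)
```

The comparison uses exact fractions, so the two sides must agree exactly rather than up to floating-point tolerance.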

Theorem 3.1 (e.g., [32, Thm. 1]). On all graphs with second largest eigenvalue in
absolute value $\lambda_2 = \lambda_2(P)$, the idealized process with divisible tokens reduces an initial
discrepancy $K$ to $\ell$ within

\[
\frac{2}{1 - \lambda_2} \ln\!\left(\frac{K n^2}{\ell}\right)
\]

time steps.

As $\lambda_2 = 1 - \Theta(\log^{-1} n)$ for the hypercube and $\lambda_2 = 1 - \Theta(n^{-2/d})$ for the
d-dimensional torus [14], one immediately gets the following corollary.

COROLLARY 3.2. The idealized process reduces an initial discrepancy of $K$
to $1$ within $O(n^{2/d} \log(Kn))$ time steps on the d-dimensional torus and within
$O(\log n \log(Kn))$ time steps on the hypercube.

An important property of all examined graph classes will be unimodality or
log-concavity of certain transition probabilities.

Definition 3.3. A function $f\colon \mathbb{N} \to \mathbb{R}_{\ge 0}$ is log-concave if
$f(i)^2 \ge f(i-1) \cdot f(i+1)$ for all $i \in \mathbb{N}_{>0}$.

Definition 3.4. A function $f\colon \mathbb{N} \to \mathbb{R}$ is unimodal if there is an $s \in \mathbb{N}$ such that
$f|_{x \le s}$ as well as $f|_{x \ge s}$ are monotone.

Log-concave functions are sometimes also called strongly unimodal [23]. We
summarize some classical results regarding log-concavity and unimodality.

Fact 3.5.

(i) Let $f$ be a log-concave function. Then, $f$ is also a unimodal function (e.g. [22,
23]).

(ii) Hoggar's theorem [18]: Let $f$ and $g$ be log-concave functions. Then their
convolution $(f * g)(k) = \sum_{i=0}^{k} f(i)\, g(k-i)$ is also log-concave.

(iii) Let $f$ be a log-concave function and $g$ be a unimodal function. Then their
convolution $f * g$ is a unimodal function [23].
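
The statements of Fact 3.5 are easy to try out on finite prefixes of concrete sequences. The helper names below (`is_log_concave`, `is_unimodal`, `convolve`, `geo`) and the sample sequences are hypothetical; the sketch only illustrates the definitions on a few examples and is no substitute for the cited proofs.

```python
from fractions import Fraction

def is_log_concave(f):
    """Check f(i)^2 >= f(i-1) * f(i+1) on the interior of a finite prefix."""
    return all(f[i] ** 2 >= f[i - 1] * f[i + 1] for i in range(1, len(f) - 1))

def is_monotone(f):
    inc = all(a <= b for a, b in zip(f, f[1:]))
    dec = all(a >= b for a, b in zip(f, f[1:]))
    return inc or dec

def is_unimodal(f):
    """Check Definition 3.4: some index s splits f into two monotone pieces."""
    return any(is_monotone(f[:s + 1]) and is_monotone(f[s:]) for s in range(len(f)))

def convolve(f, g):
    """(f * g)(k) = sum_{i=0}^{k} f(i) g(k - i), for k below min(len(f), len(g))."""
    n = min(len(f), len(g))
    return [sum(f[i] * g[k - i] for i in range(k + 1)) for k in range(n)]

def geo(p, n):
    """First n values of the geometric distribution on {1, 2, ...} with parameter p."""
    return [Fraction(0)] + [(1 - p) ** (t - 1) * p for t in range(1, n)]

f = geo(Fraction(1, 3), 30)
g = geo(Fraction(1, 5), 30)
u = [0, 0, 3, 3, 3, 1, 1] + [0] * 23      # unimodal step function, not log-concave

print(is_log_concave(f), is_log_concave(convolve(f, g)))   # Fact 3.5 (ii)
print(is_log_concave(u), is_unimodal(convolve(f, u)))      # Fact 3.5 (iii)
```

Since only the first values of each convolution are computed, the checks cover a finite prefix of the sequences.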

Our interest in unimodality is based on the fact that an alternating sum over a
unimodal function can be bounded by its maximum. More precisely, for a non-negative
and unimodal function $f\colon X \to \mathbb{R}$ and $t_0, \ldots, t_k \in X$ with $t_0 \le \cdots \le t_k$, the
following holds:

\[
\left| \sum_{i=0}^{k} (-1)^i f(t_i) \right| \le \max_{x \in X} f(x).
\]



We generalize this well-known property in the following lemma. 

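Lemma 3.6. Let $f\colon X \to \mathbb{R}$ be non-negative with $X \subseteq \mathbb{R}$. Let $A_0, \ldots, A_k \in \mathbb{R}$ and
$t_0, \ldots, t_k \in X$ such that $t_0 \le \cdots \le t_k$ and $\bigl|\sum_{i=a}^{k} A_i\bigr| \le k$ for all
$0 \le a \le k$. If $f$ has $\ell$ local extrema, then

\[
\left| \sum_{i=0}^{k} A_i f(t_i) \right| \le (\ell + 1)\, k \max_{j=0}^{k} f(t_j).
\]

Proof. Let us start with the assumption that $f(t_i)$, $0 \le i \le k$, is monotone increasing.
With $f(t_{-1}) := 0$, it is easy to see that then

\begin{align*}
\left| \sum_{i=0}^{k} A_i f(t_i) \right|
  &= \left| \sum_{i=0}^{k} \sum_{j=0}^{i} A_i \bigl( f(t_j) - f(t_{j-1}) \bigr) \right| \\
  &= \left| \sum_{j=0}^{k} \sum_{i=j}^{k} A_i \bigl( f(t_j) - f(t_{j-1}) \bigr) \right| \\
  &\le \sum_{j=0}^{k} \bigl| f(t_j) - f(t_{j-1}) \bigr| \left| \sum_{i=j}^{k} A_i \right| \\
  &\le \sum_{j=0}^{k} \bigl| f(t_j) - f(t_{j-1}) \bigr| \, k \\
  &= k \max_{j=0}^{k} f(t_j).
\end{align*}

The same holds if $f(t_i)$, $0 \le i \le k$, is monotone decreasing. If $f(x)$ has $\ell$ local extrema,
we split the sum in $(\ell + 1)$ parts such that $f(x)$ is monotone on each part and apply
above arguments. □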
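
Both the alternating-sum bound and the monotone case of the proof above can be probed with random instances. The sketch below is a numerical sanity check under stated assumptions, not a proof: sequences are sampled as values that first increase and then decrease, the weights are built from suffix sums bounded by a hypothetical constant `bound` (playing the role of the bound in Lemma 3.6), and all names are illustrative.

```python
import random

def alternating_sum(values):
    return sum((-1) ** i * v for i, v in enumerate(values))

def random_peak_sequence(rng, length, top=100):
    """Non-negative values that first increase and then decrease."""
    vals = sorted(rng.randint(0, top) for _ in range(length))
    split = rng.randint(0, length)
    return vals[:split] + sorted(vals[split:], reverse=True)

def weighted_sum_bound_holds(rng, k, bound=3, top=100):
    """Monotone case of the proof: suffix sums of the weights are at most `bound`."""
    suffix = [rng.uniform(-bound, bound) for _ in range(k + 1)] + [0.0]
    A = [suffix[i] - suffix[i + 1] for i in range(k + 1)]       # A_i with bounded suffix sums
    f_vals = sorted(rng.randint(0, top) for _ in range(k + 1))  # increasing f(t_0..t_k)
    total = sum(a * v for a, v in zip(A, f_vals))
    return abs(total) <= bound * max(f_vals) + 1e-9

rng = random.Random(1)
worst_ratio = 0.0
for _ in range(2000):
    f_vals = random_peak_sequence(rng, rng.randint(1, 12))
    if max(f_vals) > 0:
        worst_ratio = max(worst_ratio, abs(alternating_sum(f_vals)) / max(f_vals))
all_ok = all(weighted_sum_bound_holds(rng, rng.randint(0, 12)) for _ in range(2000))
print(worst_ratio, all_ok)
```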



Random Walks. To examine the diffusion process, it will be useful to define a random
walk based on $P$. For any pair of vertices $i, j$, $P^t_{i,j}$ is the probability that a random walk
guided by $P$ starting from $i$ is located at $j$ at step $t$. In the following Section 4 it
will be useful to set $\mathbf{P}_{i,j}(t) := P^t_{i,j}$ and to denote with $\mathbf{f}_{i,j}(t)$ for $i \ne j$ the first-passage
probabilities, that is, the probability that a random walk starting from $i$ visits the vertex
$j$ at step $t$ for the first time.

4. Analysis on the hypercube. We first give the definition of the hypercube.
Definition 4.1. A d-dimensional hypercube with $n = 2^d$ vertices has vertex set
$V = \{0, 1\}^d$ and edge set $E = \{\{i,j\} \mid i \text{ and } j \text{ differ in one bit}\}$.
In this section we prove the following result.

Theorem 4.2. For all initial load vectors on the d-dimensional hypercube with
$n$ vertices, the deviation between the idealized process and a discrete process with
accumulated rounding errors at most $\Lambda$ is upper bounded by $4 \Lambda \log n$ at all times and there
are load vectors for which this deviation can be $(\log n)/2$.

Recall that for BED algorithms $\Lambda = O(1)$. With Theorem 3.1 it follows that any
BED algorithm (and in particular our quasirandom algorithm) reduces the discrepancy
of any initial load vector with discrepancy $K$ to $O(\log n)$ within $O(\log n \log(Kn))$ time
steps.

4.1. Log-concave passage time on paths. To prove Theorem 4.2, we first
consider a discrete-time random walk on a path $\mathcal{P} = (0, 1, \ldots, d)$ starting at node 0. We
make use of a special generating function, called z-transform. The z-transform of a
function $g\colon \mathbb{N} \to \mathbb{R}_{\ge 0}$ is defined by $G(z) = \sum_{i=0}^{\infty} g(i) \cdot z^{-i}$. We will use the fact that
a convolution reduces to multiplication in the z-plane. Instead of the z-transform one
could carry out a similar analysis using the probability generating function. We choose
to use the z-transform here since it leads to slightly simpler arithmetic expressions.

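Our analysis also uses the geometric distribution with parameter $p$, which is defined
by $\mathrm{Geo}(p)(t) = (1 - p)^{t-1} p$ for $t \in \mathbb{N} \setminus \{0\}$ and $\mathrm{Geo}(p)(0) = 0$. It is easy to check that
$\mathrm{Geo}(p)$ is log-concave. Moreover, the z-transform of $\mathrm{Geo}(p)$ is

\[
\sum_{t=1}^{\infty} (1 - p)^{t-1} p \cdot z^{-t} = \frac{p}{z - (1 - p)}.
\]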
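
The closed form $p/(z - (1-p))$ of the z-transform of $\mathrm{Geo}(p)$ and the convolution rule can be verified numerically with a truncated series. This is a hedged sketch: the parameters `p`, `q`, `z` and the truncation length are arbitrary assumptions, and the series only converges for $|z| > 1 - p$.

```python
def geo(p, t):
    """Geo(p)(t) = (1 - p)^(t - 1) * p for t >= 1 and 0 for t = 0."""
    return (1 - p) ** (t - 1) * p if t >= 1 else 0.0

def z_transform(g, z, terms=200):
    """Truncated z-transform: sum of g(t) * z^(-t) for t = 0, ..., terms - 1."""
    return sum(g(t) * z ** (-t) for t in range(terms))

def convolve(f, g, t):
    """(f * g)(t) = sum_{i=0}^{t} f(i) g(t - i)."""
    return sum(f(i) * g(t - i) for i in range(t + 1))

p, q, z = 0.3, 0.5, 1.5     # hypothetical parameters with z > 1 - p and z > 1 - q
series_p = z_transform(lambda t: geo(p, t), z)
closed_p = p / (z - (1 - p))
closed_q = q / (z - (1 - q))

# A convolution in time becomes a product of z-transforms.
series_conv = z_transform(
    lambda t: convolve(lambda i: geo(p, i), lambda i: geo(q, i), t), z)
print(abs(series_p - closed_p), abs(series_conv - closed_p * closed_q))
```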



For each node $i \in \mathcal{P}$, let $\alpha_i$ be the loop probability at node $i$ and $\beta_i$ be the upward
probability, i.e., the probability to move to node $i + 1$. Then, the downward probability
at node $i$ is $1 - \alpha_i - \beta_i$. We can assume that $\beta_i > 0$ for all $i \in \mathcal{P} \setminus \{d\}$. We are
interested in the first-passage probabilities $\mathbf{f}_{0,d}(t)$. Observe that

\[
\mathbf{f}_{0,d}(t) = (\mathbf{f}_{0,1} * \mathbf{f}_{1,2} * \cdots * \mathbf{f}_{d-1,d})(t). \tag{4.1}
\]

In the following, we will show that $\mathbf{f}_{0,d}(t)$ is log-concave. Indeed, we show a much stronger
result:

Theorem 4.3. Consider a random walk on a path $\mathcal{P} = (0, 1, \ldots, d)$ starting at
node 0. If $\alpha_i \ge \frac{1}{2}$ for all nodes $i \in \mathcal{P}$, then $\mathbf{f}_{0,d}$ can be expressed as convolution of
$d$ independent geometric distributions.

As the geometric distribution is log-concave and the convolution of log-concave
functions is again log-concave (cf. Fact 3.5), we immediately get the following corollary.

COROLLARY 4.4. Consider a random walk on a path $\mathcal{P} = (0, 1, \ldots, d)$ starting at
node 0. If $\alpha_i \ge \frac{1}{2}$ for all nodes $i \in \mathcal{P}$, then $\mathbf{f}_{0,d}(t)$ is log-concave in $t$.
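
The corollary can be observed on a concrete path by computing the first-passage distribution with a simple dynamic program in which node $d$ is treated as absorbing. The loop and upward probabilities below are hypothetical: the first walk has all loop probabilities equal to 1/2, the second one has no loops at all and shows how log-concavity can fail without the assumption.

```python
from fractions import Fraction

def first_passage(alpha, beta, T):
    """Return [f_{0,d}(0), ..., f_{0,d}(T)] on the path 0..d with d = len(beta).

    alpha[i] is the loop probability and beta[i] the upward probability at
    node i < d; the downward probability is 1 - alpha[i] - beta[i].
    """
    d = len(beta)
    dist = [Fraction(0)] * d          # mass on nodes 0..d-1 that has not reached d yet
    dist[0] = Fraction(1)
    f = [Fraction(0)]                 # the walk starts at 0, so f(0) = 0
    for _ in range(T):
        new = [Fraction(0)] * d
        hit = Fraction(0)
        for i, mass in enumerate(dist):
            new[i] += mass * alpha[i]
            if i + 1 < d:
                new[i + 1] += mass * beta[i]
            else:
                hit += mass * beta[i]  # first arrival at node d
            if i > 0:
                new[i - 1] += mass * (1 - alpha[i] - beta[i])
        dist = new
        f.append(hit)
    return f

def is_log_concave(f):
    return all(f[i] ** 2 >= f[i - 1] * f[i + 1] for i in range(1, len(f) - 1))

half = Fraction(1, 2)
lazy = first_passage([half] * 4,
                     [half, Fraction(3, 8), Fraction(1, 4), Fraction(1, 8)], 60)
no_loops = first_passage([Fraction(0)] * 2, [Fraction(1), half], 10)
print(is_log_concave(lazy), is_log_concave(no_loops))
```

Without loops the walk can only reach node $d$ at steps of one parity, so zeros alternate with positive values and log-concavity breaks.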

Note that Theorem 4.3 follows directly from Theorem 1.2 of Fill [10]. As Theorem 4.3
is a crucial ingredient for proving our main result Theorem 4.2 for the hypercube, we give
a simpler alternative proof of the statement. While Fill's proof is purely stochastic, our
proof is elementary and based on functional analysis of the z-transform. Our analysis for
the discrete-time random walk should also be compared with Keilson's analysis of the
continuous-time process [22]. The continuous-time process was independently considered
even earlier by Karlin and McGregor [20].

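Before proving the theorem, we will show how to obtain $\mathbf{f}_{0,d}(t)$ by a recursive
argument. For this, suppose we are at node $i \in \mathcal{P} \setminus \{d\}$. The next step is a loop with
probability $\alpha_i$. Moreover, the next subsequent non-loop move ends at $i + 1$ with
probability $\frac{\beta_i}{1 - \alpha_i}$ and at $i - 1$ with probability $\frac{1 - \alpha_i - \beta_i}{1 - \alpha_i}$. Thus, for all $i \in \mathcal{P} \setminus \{d\}$,

\[
\mathbf{f}_{i,i+1}(t) = \frac{\beta_i}{1 - \alpha_i} \cdot \mathrm{Geo}(1 - \alpha_i)(t)
  + \frac{1 - \beta_i - \alpha_i}{1 - \alpha_i} \cdot \bigl(\mathrm{Geo}(1 - \alpha_i) * \mathbf{f}_{i-1,i} * \mathbf{f}_{i,i+1}\bigr)(t),
\]

with corresponding z-transform

\[
\mathcal{F}_{i,i+1}(z) = \frac{\beta_i}{1 - \alpha_i} \cdot \frac{1 - \alpha_i}{z - \alpha_i}
  + \frac{1 - \beta_i - \alpha_i}{1 - \alpha_i} \cdot \frac{1 - \alpha_i}{z - \alpha_i} \cdot \mathcal{F}_{i-1,i}(z) \cdot \mathcal{F}_{i,i+1}(z).
\]

Rearranging terms yields

\[
\mathcal{F}_{i,i+1}(z) = \frac{\beta_i}{z - \alpha_i - (1 - \beta_i - \alpha_i) \cdot \mathcal{F}_{i-1,i}(z)}, \tag{4.2}
\]

for all $i \in \mathcal{P} \setminus \{d\}$. So $\mathcal{F}_{i,i+1}(z)$ is obtained recursively with
$\mathcal{F}_{0,1}(z) = \frac{\beta_0}{z - (1 - \beta_0)}$. Finally the z-transform of equation (4.1) is

\[
\mathcal{F}_{0,d}(z) = \mathcal{F}_{0,1}(z) \cdot \mathcal{F}_{1,2}(z) \cdot \ldots \cdot \mathcal{F}_{d-1,d}(z). \tag{4.3}
\]

In the following, we prove some properties of $\mathcal{F}_{i,i+1}(z)$ for $i \in \mathcal{P} \setminus \{d\}$.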

Lemma 4.5. Except for singularities, $\mathcal{F}_{i,i+1}(z)$ is monotone decreasing in $z$, for all
$i \in \mathcal{P} \setminus \{d\}$.

Proof. We will show the claim by induction on $i$. It is easy to see that the claim holds
for the base case ($i = 0$) since $\mathcal{F}_{0,1}(z) = \frac{\beta_0}{z - (1 - \beta_0)}$. Assume inductively that the claim
holds for $\mathcal{F}_{i-1,i}(z)$. With $1 - \beta_i - \alpha_i \ge 0$ this directly implies that the denominator of
equation (4.2) is increasing in $z$. The claim for $\mathcal{F}_{i,i+1}(z)$ follows. □

Lemma 4.6. For all $i \in \mathcal{P} \setminus \{d\}$, $\mathcal{F}_{i,i+1}(z)$ has exactly $i + 1$ poles which are all in
the interval $(0, 1)$. The poles of $\mathcal{F}_{i,i+1}(z)$ are distinct from the poles of $\mathcal{F}_{i-1,i}(z)$.

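Proof. Before proving the claims of the lemma, we will show that $\mathcal{F}_{i,i+1}(0) \ge -1$ and
$\mathcal{F}_{i,i+1}(1) = 1$ for all $i \in \mathcal{P} \setminus \{d\}$. Observe that
$\mathcal{F}_{0,1}(0) = -\frac{\beta_0}{1 - \beta_0} \ge -1$ since $\alpha_0 = 1 - \beta_0 \ge \frac{1}{2}$.
Also observe that $\mathcal{F}_{0,1}(1) = 1$. Assume inductively that $\mathcal{F}_{i-1,i}(0) \ge -1$ and
$\mathcal{F}_{i-1,i}(1) = 1$. Then with equation (4.2),

\[
\mathcal{F}_{i,i+1}(0) \ge \frac{\beta_i}{-\alpha_i + (1 - \beta_i - \alpha_i)} = \frac{\beta_i}{1 - 2\alpha_i - \beta_i} \ge -1,
\]

since $1 - 2\alpha_i \le 0$. Moreover, $\mathcal{F}_{i,i+1}(1) = \frac{\beta_i}{1 - \alpha_i - (1 - \beta_i - \alpha_i)} = 1$. Thus,
$\mathcal{F}_{i,i+1}(0) \ge -1$ and $\mathcal{F}_{i,i+1}(1) = 1$ for all $i \in \mathcal{P} \setminus \{d\}$.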

We will now show the claims of the lemma by induction. For the base case observe
that $\mathcal{F}_{0,1}(z) = \frac{\beta_0}{z - (1 - \beta_0)}$ has one pole at $z = 1 - \beta_0 > 0$ and $\mathcal{F}_{-1,0}$ is not defined. This
implies the claim for $i = 0$. Suppose the claim holds for $\mathcal{F}_{i-1,i}(z)$ and let $z_1, z_2, \ldots, z_i$ be
the poles of $\mathcal{F}_{i-1,i}(z)$. Without loss of generality, we assume $0 < z_1 < z_2 < \cdots < z_i < 1$.
Let $g_i(z)$ be the denominator of equation (4.2), that is,

\[
g_i(z) := z - \alpha_i - (1 - \beta_i - \alpha_i) \cdot \mathcal{F}_{i-1,i}(z).
\]

Observe that

(i) $g_i(z)$ has the same set of poles as $\mathcal{F}_{i-1,i}(z)$,

(ii) $\lim_{z \to -\infty} g_i(z) = -\infty$, and

(iii) $\lim_{z \to \infty} g_i(z) = \infty$.

By equation (4.2), $\mathcal{F}_{i,i+1}(z)$ has its poles at the zeros of $g_i(z)$. Lemma 4.5 shows that
in each interval $(z_j, z_{j+1})$ with $1 \le j \le i - 1$, $g_i(z)$ is increasing in $z$. Using fact (i)
this implies that $g_i(z)$ has exactly one zero in each interval $(z_j, z_{j+1})$. Thus $\mathcal{F}_{i,i+1}(z)$ has
exactly one pole in each interval $(z_j, z_{j+1})$. Similarly, Lemma 4.5 with facts (i), (ii) and
(iii) imply that $\mathcal{F}_{i,i+1}(z)$ has exactly one pole, say $z'$, in the interval $[-\infty, z_1)$ and one
pole, say $z''$, in the interval $(z_i, \infty]$. This implies that $\mathcal{F}_{i,i+1}(z)$ has exactly $i + 1$ poles
which are all distinct from the poles of $\mathcal{F}_{i-1,i}(z)$. It remains to show that $z' > 0$ and
$z'' < 1$.

Since $\mathcal{F}_{i-1,i}(0) \ge -1$ and $\lim_{z \to -\infty} \mathcal{F}_{i-1,i}(z) = -0$ it follows with Lemma 4.5 that
$-1 \le \mathcal{F}_{i-1,i}(z) \le 0$ for all real $z \le 0$. So $g_i(z) < 0$ for all real $z \le 0$. It follows
that $z' > 0$. Similarly, since $\mathcal{F}_{i-1,i}(1) = 1$ and $\lim_{z \to \infty} \mathcal{F}_{i-1,i}(z) = +0$, it follows with
Lemma 4.5 that $0 \le \mathcal{F}_{i-1,i}(z) \le 1$ for all real $z \ge 1$. So $g_i(z) > 0$ for all real $z \ge 1$. It
follows that $z'' < 1$. This finishes the proof of our inductive step. The claim follows. □

Lemma 4.7. Let $(b_{j,i})_{j=0}^{i}$ be the poles of $\mathcal{F}_{i,i+1}(z)$, $i \in \mathcal{P} \setminus \{d\}$, and define
$P_i(z) = \prod_{j=0}^{i} (z - b_{j,i})$. Then $\mathcal{F}_{i,i+1}(z) = \beta_i \cdot \frac{P_{i-1}(z)}{P_i(z)}$ for all $i \in \mathcal{P} \setminus \{d\}$.

Proof. Our proof is by induction on $i$. For the base case ($i = 0$), observe that $P_{-1}(z) = 1$
and thus $\mathcal{F}_{0,1}(z)$ has the desired form. Suppose the claim holds for $\mathcal{F}_{i-1,i}(z)$. Then
equation (4.2) implies

\begin{align*}
\mathcal{F}_{i,i+1}(z)
  &= \frac{\beta_i}{z - \alpha_i - (1 - \beta_i - \alpha_i) \cdot \beta_{i-1} \cdot \frac{P_{i-2}(z)}{P_{i-1}(z)}} \\
  &= \frac{\beta_i \cdot P_{i-1}(z)}{(z - \alpha_i) \cdot P_{i-1}(z) - (1 - \beta_i - \alpha_i) \cdot \beta_{i-1} \cdot P_{i-2}(z)}. \tag{4.4}
\end{align*}

Observe that $(z - \alpha_i) \cdot P_{i-1}(z)$ is a polynomial of degree $i + 1$ where the leading term
has a coefficient of 1. This also holds for the denominator of equation (4.4), since there,
we only subtract a polynomial of order $i - 1$. By Lemma 4.6 we know that $\mathcal{F}_{i,i+1}(z)$
has exactly $i + 1$ real positive poles. It follows that the denominator of equation (4.4) is
equal to $P_i(z)$. The claim follows. □

We are now ready to prove Theorem 4.3. 

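Proof of Theorem 4.3. By equation (4.3) and Lemma 4.7, we get

\[
\mathcal{F}_{0,d}(z) = \prod_{i=0}^{d-1} \beta_i \cdot \frac{P_{i-1}(z)}{P_i(z)}
  = \frac{\prod_{i=0}^{d-1} \beta_i}{P_{d-1}(z)}
  = K_d \cdot \prod_{i=0}^{d-1} \frac{1 - b_{i,d-1}}{z - b_{i,d-1}},
\]

where $(b_{i,d-1})_{i=0}^{d-1}$ are the poles of $\mathcal{F}_{d-1,d}(z)$ as defined in Lemma 4.7 and
$K_d = \prod_{i=0}^{d-1} \frac{\beta_i}{1 - b_{i,d-1}}$. By Lemma 4.6, $b_{i,d-1} \in (0, 1)$ for all $i$. Now for each $i$ the term
$\frac{1 - b_{i,d-1}}{z - b_{i,d-1}}$ is the z-transform of the geometric distribution with parameter $1 - b_{i,d-1}$, i.e.,
$\mathrm{Geo}(1 - b_{i,d-1})(t)$.

Thus, $\mathbf{f}_{0,d}(t)$ can be expressed as the convolution of $d$ independent geometric
distributions

\[
\mathbf{f}_{0,d}(t) = K_d \cdot \bigl[\mathrm{Geo}(1 - b_{0,d-1}) * \mathrm{Geo}(1 - b_{1,d-1}) * \cdots * \mathrm{Geo}(1 - b_{d-1,d-1})\bigr](t).
\]

Moreover, since $\mathbf{f}_{0,d}$ is a probability distribution over $t$ and the convolution of probability
distributions is again a probability distribution, we have $K_d = 1$. The theorem follows. □

One should note that it follows from [10, Theorem 1.2] that the parameters
$(b_{i,d-1})_{i=0}^{d-1}$ of the geometric distributions are the eigenvalues of the underlying
transition matrix.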

Recall that our aim is to prove unimodality for the function $\mathbf{P}_{0,j}(t)$. Using the
simple convolution formula $\mathbf{P}_{0,j} = \mathbf{f}_{0,j} * \mathbf{P}_{j,j}$ and the log-concavity of $\mathbf{f}_{0,j}$, it suffices to
prove that $\mathbf{P}_{j,j}$ is unimodal (cf. Fact 3.5). In the following, we will prove that $\mathbf{P}_{j,j}$ is
even non-increasing in $t$.

Lemma 4.8. Let $P$ be the $(d + 1) \times (d + 1)$-transition matrix defining an ergodic
Markov chain on a path $\mathcal{P} = (0, \ldots, d)$. If $P_{ii} \ge \frac{1}{2}$ for all $0 \le i \le d$ then for all
$0 \le i \le d$, $P^t_{i,i}$ is non-increasing in $t$.

Proof. It is well known that ergodic Markov chains on paths are time reversible (see e.g.
Section 4.8 of Ross [33]). To see this let $\pi = (\pi_0, \ldots, \pi_d)$ be the stationary distribution.
Then for all $0 \le i \le d - 1$ the rate at which the process goes from $i$ to $i + 1$ (namely,
$\pi_i P_{i,i+1}$) is equal to the rate at which the process goes from $i + 1$ to $i$ (namely, $\pi_{i+1} P_{i+1,i}$).
Thus, $P$ is time-reversible.

One useful property of a time-reversible matrix is that its eigenvalues are all real.
The Geršgorin disc theorem states that every eigenvalue $\lambda_j$, $0 \le j \le d$, satisfies the
condition

\[
|\lambda_j - P_{ii}| \le 1 - P_{ii}
\]

for some $0 \le i \le d$. Since $P_{ii} \ge \frac{1}{2}$ this directly implies that all eigenvalues are in the
interval $[0, 1]$.

It is well-known that there is an orthonormal base of $\mathbb{R}^{d+1}$ which is formed by the
eigenvectors $v_0, v_1, \ldots, v_d$ (see e.g. [17]). Then for any vector $w \in \mathbb{R}^{d+1}$,
$w = \sum_{j=0}^{d} \langle w, v_j \rangle\, v_j$, where $\langle \cdot, \cdot \rangle$ denotes the inner product. Applying this to the $i$-th
unit vector $e_i$ and using $[\,\cdot\,]_i$ to denote the $i$-th entry of a vector in $\mathbb{R}^{d+1}$ we obtain

\[
e_i = \sum_{j=0}^{d} \langle e_i, v_j \rangle\, v_j = \sum_{j=0}^{d} [v_j]_i\, v_j.
\]

Thus,

\[
P^t e_i = P^t \Bigl( \sum_{j=0}^{d} [v_j]_i\, v_j \Bigr) = \sum_{j=0}^{d} [v_j]_i\, P^t v_j = \sum_{j=0}^{d} [v_j]_i\, \lambda_j^t\, v_j
\]

and finally

\[
P^t_{i,i} = [P^t e_i]_i = \sum_{j=0}^{d} [v_j]_i\, \lambda_j^t\, [v_j]_i = \sum_{j=0}^{d} \lambda_j^t\, [v_j]_i^2,
\]

which is non-increasing in $t$ as $[v_j]_i \in \mathbb{R}$ and $0 \le \lambda_j \le 1$ for all $0 \le j \le d$. □
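
Lemma 4.8 can be observed directly by raising a small lazy path chain to successive powers and reading off the diagonal. The chain below (its upward and downward probabilities and the helper names) is a hypothetical example with all loop probabilities at least 1/2; the computation is exact, so the monotonicity check involves no numerical tolerance.

```python
from fractions import Fraction

def path_chain(up, down):
    """Transition matrix of a walk on the path 0..len(up)-1; loops get the remaining mass."""
    n = len(up)
    P = [[Fraction(0)] * n for _ in range(n)]
    for k in range(n):
        if k + 1 < n:
            P[k][k + 1] = up[k]
        if k > 0:
            P[k][k - 1] = down[k]
        P[k][k] = 1 - sum(P[k])
    return P

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def return_probabilities(P, T):
    """rows[t][i] = (P^t)[i][i] for t = 0, ..., T."""
    n = len(P)
    power = [[Fraction(int(a == b)) for b in range(n)] for a in range(n)]
    rows = []
    for _ in range(T + 1):
        rows.append([power[i][i] for i in range(n)])
        power = mat_mul(power, P)
    return rows

F = Fraction
P = path_chain(up=[F(1, 2), F(1, 4), F(1, 8), F(0)],
               down=[F(0), F(1, 4), F(1, 4), F(1, 2)])
rows = return_probabilities(P, 30)
non_increasing = all(rows[t + 1][i] <= rows[t][i]
                     for t in range(30) for i in range(len(P)))
print(non_increasing)
```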

4.2. Unimodal transition probabilities on the hypercube. Combining
Lemma 4.8 and Theorem 4.3 and then projecting the random walk on the hypercube on
a random walk on a path, we obtain the following result.

Theorem 4.9. Let $i, j \in V$ be two vertices of a d-dimensional hypercube. Then,
$\mathbf{P}_{i,j}(t)$ is unimodal.

Proof. We use the following projection of a random walk on a d-dimensional hypercube
with loop probability 1/2 to a random walk on a path with $d + 1$ vertices, again with loop
probability 1/2. The induced random walk is obtained from the mapping $x \mapsto |x|_1$, that
is, vertices in $\{0, 1\}^d$ with the same number of ones are equivalent. It is easy to check
that this new random walk is a random walk on a path with vertices $0, 1, \ldots, d$ that
moves right with probability $\lambda_k = \frac{d - k}{2d}$, left with probability $\mu_k = \frac{k}{2d}$ and loops with
probability $\frac{1}{2}$. (This process is also known as Ehrenfest chain [16].)

Consider now the random walk on the path with vertex set $\{0, 1, \ldots, d\}$ and let $j$ be
an arbitrary number with $0 \le j \le d$. Recall that $\mathbf{P}_{0,j}$ can be expressed as a convolution
(cf. [16]) of $\mathbf{P}$ and $\mathbf{f}$ as follows,

\[
\mathbf{P}_{0,j} = \mathbf{f}_{0,j} * \mathbf{P}_{j,j}.
\]

By Corollary 4.4, $\mathbf{f}_{0,j}(t)$ is log-concave. Moreover, Lemma 4.8 implies that $\mathbf{P}_{j,j}(t)$ is
non-increasing in $t$ and hence unimodal. As the convolution of any log-concave function
with any unimodal function is again unimodal (cf. Fact 3.5), it follows that $\mathbf{P}_{0,j}(t)$ is
unimodal in $t$.

Now fix two vcrtices i,j of the c?-dimcnsional hypercubc. By symmetry, we may 
assume that z = 0'' = 0. Conditioned on the event that thc projccted random walk is 
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locatcd at a vcrtcx with \j\i oncs at stcp t, cvcry vcrtcx with \ j\i oncs is cquaUy likcly. 
This gives Po,j(i) = Po,|i|i and therefore the unimodaUty of Po,|j|i(t) impUes 

directly the unimodaUty of Poj {t), as needed. □ 
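The projection argument can be checked numerically on a small cube. The following is a minimal sketch, not part of the original analysis: it iterates the lazy walk on $\{0,1\}^d$ and the projected chain on $\{0, \ldots, d\}$ with exact rational arithmetic, asserts that the hypercube probability equals the path probability divided by the number of vertices of that weight, and tests unimodality on a finite time horizon. All function names are hypothetical, and a finite horizon can only support the theorem, not prove it.

```python
from fractions import Fraction
from itertools import product
from math import comb


def hypercube_step(dist, d):
    """One step of the lazy walk: stay w.p. 1/2, else flip a uniformly chosen coordinate."""
    new = {v: Fraction(0) for v in dist}
    for v, p in dist.items():
        new[v] += p / 2
        for c in range(d):
            w = v[:c] + (1 - v[c],) + v[c + 1:]
            new[w] += p / (2 * d)
    return new


def path_step(dist, d):
    """One step of the projected chain on the number of ones, 0..d."""
    new = [Fraction(0)] * (d + 1)
    for k, p in enumerate(dist):
        new[k] += p / 2                             # loop
        if k < d:
            new[k + 1] += p * Fraction(d - k, 2 * d)  # move right
        if k > 0:
            new[k - 1] += p * Fraction(k, 2 * d)      # move left
    return new


def is_unimodal(seq):
    """True if seq is non-decreasing up to some index and non-increasing afterwards."""
    i = 0
    while i + 1 < len(seq) and seq[i] <= seq[i + 1]:
        i += 1
    while i + 1 < len(seq) and seq[i] >= seq[i + 1]:
        i += 1
    return i == len(seq) - 1


def check(d=4, steps=30):
    """Verify the projection identity and unimodality for all targets up to `steps`."""
    cube = {v: Fraction(0) for v in product((0, 1), repeat=d)}
    cube[(0,) * d] = Fraction(1)
    path = [Fraction(1)] + [Fraction(0)] * d
    history = {v: [] for v in cube}
    for _ in range(steps + 1):
        for v, p in cube.items():
            # hypercube probability = path probability / number of vertices of that weight
            assert p == path[sum(v)] / comb(d, sum(v))
            history[v].append(p)
        cube, path = hypercube_step(cube, d), path_step(path, d)
    return all(is_unimodal(h) for h in history.values())
```

Calling `check(4, 30)` runs the comparison for every target vertex of the 4-dimensional cube; exact fractions are used so that ties are not disturbed by floating-point noise.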

With more direct methods, one can prove the following supplementary result giving further insights into the distribution of $\mathbf{P}_{i,j}(t)$. As the result is not required for our analysis, the proof is given in the appendix.

Proposition 4.10. Let $i, j \in V$ be two vertices of the $d$-dimensional hypercube with $\mathrm{dist}(i,j) \geq d/2$. Then, $\mathbf{P}_{i,j}(t)$ is monotone increasing.

4.3. Analysis of the discrete algorithm. We are now ready to prove our main 
result for hypercubes. 

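Proof of Theorem 4.2. By symmetry, it suffices to bound the deviation at the vertex $0 \equiv 0^d$. Hence by equation (3.1) we have to bound

$$\big| x_0^{(t)} - \xi_0^{(t)} \big| \leq \Big| \sum_{s=0}^{t-1} \sum_{[i:j] \in E} e_{i,j}^{(t-s)} \big( \mathbf{P}_{0,i}(s) - \mathbf{P}_{0,j}(s) \big) \Big|$$

$$\leq \Big| \sum_{s=0}^{t-1} \sum_{[i:j] \in E} e_{i,j}^{(t-s)}\, \mathbf{P}_{0,i}(s) \Big| + \Big| \sum_{s=0}^{t-1} \sum_{[i:j] \in E} e_{i,j}^{(t-s)}\, \mathbf{P}_{0,j}(s) \Big|$$

$$\leq \sum_{[i:j] \in E} \Big| \sum_{s=0}^{t-1} e_{i,j}^{(t-s)}\, \mathbf{P}_{0,i}(s) \Big| + \sum_{[i:j] \in E} \Big| \sum_{s=0}^{t-1} e_{i,j}^{(t-s)}\, \mathbf{P}_{0,j}(s) \Big|.$$

Using Theorem 4.9, we know that the sequences $\mathbf{P}_{0,i}(s)$ and $\mathbf{P}_{0,j}(s)$ are unimodal in $s$ and hence we can bound both summands by Lemma 3.6 (where $\ell = 1$) to obtain that

$$\big| x_0^{(t)} - \xi_0^{(t)} \big| \leq 2\Lambda \sum_{[i:j] \in E} \max_{s=0}^{t-1} \mathbf{P}_{0,i}(s) + 2\Lambda \sum_{[i:j] \in E} \max_{s=0}^{t-1} \mathbf{P}_{0,j}(s) = 2\Lambda d \sum_{i \in V} \max_{s=0}^{t-1} \mathbf{P}_{0,i}(s). \tag{4.5}$$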

To bound the last term, we view the random walk as the following process, similar to a balls-and-bins process. In each step $t \in \mathbb{N}$ we choose a coordinate $i \in \{1, \ldots, d\}$ uniformly at random. Then with probability 1/2 we flip the bit of this coordinate; otherwise we keep it (equivalently, we set the bit to 1 with probability 1/2 and to zero otherwise).

Now we partition the random walk's distribution at step $t$ according to the number of different coordinates chosen (not necessarily flipped) until step $t$. Consider $\mathbf{P}_{0,x}(t)$ for a vertex $x \in \{0,1\}^d$. Note that by the symmetry of the hypercube, $\mathbf{P}_{0,x}(t)$ is the same for all $x \in \{0,1\}^d$ with the same $|x|_1$. Hence let us fix a value $\ell$ with $0 \leq \ell \leq d$ and let us consider $\mathbf{P}_{0,\ell}(t)$ which is the probability for reaching the vertex, say, $1^{\ell} 0^{d-\ell}$ from $0 \equiv 0^d$ within $t$ steps. Since (i) the $k$ chosen coordinates must contain the $\ell$ ones and (ii) all $k$ chosen coordinates must be set to the correct value, we have

$$\mathbf{P}_{0,\ell}(t) = \sum_{k=\ell}^{d} \Pr[\text{exactly } k \text{ coordinates chosen in } t \text{ steps}] \cdot 2^{-k} \cdot \binom{d-\ell}{k-\ell} \Big/ \binom{d}{k}. \tag{4.6}$$
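Equation (4.6) can be sanity-checked against a direct computation for small $d$. The sketch below is illustrative only and all function names are hypothetical: it computes the distribution of the number of distinct coordinates chosen in $t$ uniform draws by dynamic programming, evaluates the right-hand side of (4.6), and compares it with the probability obtained by iterating the projected chain on the number of ones and dividing by $\binom{d}{\ell}$.

```python
from fractions import Fraction
from math import comb


def distinct_coordinate_distribution(d, t):
    """Pr[exactly k distinct coordinates were chosen in t uniform draws], for k = 0..d."""
    dist = [Fraction(1)] + [Fraction(0)] * d
    for _ in range(t):
        new = [Fraction(0)] * (d + 1)
        for k, p in enumerate(dist):
            new[k] += p * Fraction(k, d)              # an already chosen coordinate is drawn
            if k < d:
                new[k + 1] += p * Fraction(d - k, d)  # a new coordinate is drawn
        dist = new
    return dist


def formula_4_6(d, ell, t):
    """Right-hand side of equation (4.6)."""
    chosen = distinct_coordinate_distribution(d, t)
    return sum(
        chosen[k] * Fraction(comb(d - ell, k - ell), 2**k * comb(d, k))
        for k in range(ell, d + 1)
    )


def direct_probability(d, ell, t):
    """Probability of sitting at one fixed vertex with ell ones after t lazy steps from 0^d."""
    path = [Fraction(1)] + [Fraction(0)] * d
    for _ in range(t):
        new = [Fraction(0)] * (d + 1)
        for k, p in enumerate(path):
            new[k] += p / 2
            if k < d:
                new[k + 1] += p * Fraction(d - k, 2 * d)
            if k > 0:
                new[k - 1] += p * Fraction(k, 2 * d)
        path = new
    return path[ell] / comb(d, ell)
```

Both functions return exact fractions, so they can be compared with `==` rather than with a tolerance.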
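Using this to estimate $\mathbf{P}_{0,i}(s)$, we can bound equation (4.5) by

$$\big| x_0^{(t)} - \xi_0^{(t)} \big| \leq 2\Lambda d \sum_{\ell=0}^{d} \binom{d}{\ell} \max_{s=0}^{t-1} \mathbf{P}_{0,\ell}(s)$$

$$= 2\Lambda d \sum_{\ell=0}^{d} \binom{d}{\ell} \max_{s=0}^{t-1} \sum_{k=\ell}^{d} \Pr[\text{exactly } k \text{ coordinates chosen in } s \text{ steps}] \cdot 2^{-k} \cdot \binom{d-\ell}{k-\ell} \Big/ \binom{d}{k}$$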

<9A/7'V^'* mnY<i ()=-<;) {l) o-fc 

^ zi\ a l^g^o maxj,^^ — — z . 

The fraction in the last term corresponds to the probability of a hyper-geometric distri- 
bution and is therefore trivially upper-bounded by 1. This allows us to conclude that 

|4*^ - I < 2A rf Eto < 4Ad 
and the first claim of the theorem follows.

The second claim follows by the following simple construction. Define a load vector $x^{(0)}$ such that $x_v^{(0)} := d$ for all vertices $v = (v_1, v_2, \ldots, v_d) \in \{0,1\}^d$ with $v_1 = 0$, and $x_v^{(0)} := 0$ otherwise. Then for each edge $\{i,j\} \in E$ with $0 = i_1 \neq j_1$ the fractional flow at step 1 is $\big( \mathbf{P}_{i,j}\, x_i^{(0)} - \mathbf{P}_{j,i}\, x_j^{(0)} \big) = +\frac{1}{2}$. Since in the first time step no rounding errors have occurred so far, each edge is allowed to round up and down arbitrarily. Hence we can let all these edges round towards $j$, i.e., the integral flow from $i$ to $j$ is set to 1 for each such edge $\{i,j\} \in E$. By definition, this implies for the corresponding rounding error, $e_{i,j}^{(1)} = -\frac{1}{2}$. Moreover, we have the following load distribution after step 1. We have $x_v^{(1)} = 0$ for all vertices $v$ if $v_1 = 0$, and $x_v^{(1)} = d$ otherwise. Similarly, the fractional flow for each edge $\{i,j\} \in E$ with $0 = i_1 \neq j_1$ is $\big( \mathbf{P}_{i,j}\, x_i^{(1)} - \mathbf{P}_{j,i}\, x_j^{(1)} \big) = -\frac{1}{2}$. Since $e_{i,j}^{(1)} = -\frac{1}{2}$, $\big| \sum_{s=1}^{2} e_{i,j}^{(s)} \big|$ will be minimized if $e_{i,j}^{(2)} = \frac{1}{2}$. Hence we can set the integral flow from $i$ to $j$ at step 2 to $-1$. This implies that we end up in exactly the same situation as at the beginning: the load vector is the same and also the sum over the previous rounding errors along each edge is zero. We conclude that there is an instance of the quasirandom algorithm for which $x^{(t)} = x^{(t \bmod 2)}$. This gives the claim. □

5. Analysis on d-dimensional torus graphs. We start this section with the formal definition of a $d$-dimensional torus.

Definition 5.1. A $d$-dimensional torus $T(n_1, n_2, \ldots, n_d)$ with $n = n_1 \cdot n_2 \cdots n_d$ vertices has vertex set $V = \{0, 1, \ldots, n_1 - 1\} \times \{0, 1, \ldots, n_2 - 1\} \times \ldots \times \{0, 1, \ldots, n_d - 1\}$ and every vertex $(i_1, i_2, \ldots, i_d) \in V$ has $2d$ neighbors $((i_1 \pm 1) \bmod n_1, i_2, \ldots, i_d)$, $(i_1, (i_2 \pm 1) \bmod n_2, i_3, \ldots, i_d)$, $\ldots$, $(i_1, i_2, \ldots, i_{d-1}, (i_d \pm 1) \bmod n_d)$. Henceforth, we will always assume that $d = O(1)$. We call a torus uniform if $n_1 = n_2 = \cdots = n_d = \sqrt[d]{n}$.

Without loss of generality we will assume in the remainder that $n_1 \leq n_2 \leq \cdots \leq n_d$. By the symmetry of the torus this does not restrict our results.
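The neighborhood structure of Definition 5.1 can be written down directly. This is a small illustrative sketch with hypothetical function names; the side lengths are passed as a tuple `sides` $= (n_1, \ldots, n_d)$, and neighbors are listed with multiplicity, so a side of length 2 yields the same neighbor twice.

```python
from itertools import product


def torus_vertices(sides):
    """All vertices of T(n_1, ..., n_d) as coordinate tuples."""
    return list(product(*(range(n) for n in sides)))


def torus_neighbors(vertex, sides):
    """The 2d neighbors of `vertex`: each coordinate shifted by +1 and -1 modulo its side length."""
    result = []
    for axis, n in enumerate(sides):
        for step in (+1, -1):
            w = list(vertex)
            w[axis] = (w[axis] + step) % n
            result.append(tuple(w))
    return result
```

For example, `torus_neighbors((0, 0), (3, 4))` lists the four neighbors of the origin in $T(3,4)$, including the wrap-around neighbors `(2, 0)` and `(0, 3)`.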

Recall that $\lambda_2$ denotes the second largest eigenvalue in absolute value. Before we analyze the deviation between the idealized and discrete process, we estimate $(1 - \lambda_2)^{-1}$ for general torus graphs.

Lemma 5.2. For a $d$-dimensional torus $T = T(n_1, n_2, \ldots, n_d)$, $(1 - \lambda_2)^{-1} = \Theta(n_d^2)$.

Proof. Following the notation of [3], for a $k$-regular graph $G$, let $\mathbf{L}(G)$ be the matrix given by $\mathbf{L}_{u,u}(G) = 1$, $\mathbf{L}_{u,v}(G) = -\frac{1}{k}$ if $\{u,v\} \in E(G)$ and $\mathbf{L}_{u,v}(G) = 0$ otherwise. Let $C_q$ be a cycle with $q$ vertices. As shown in [3, Example 1.4], the eigenvalues of $\mathbf{L}(C_q)$ are $1 - \cos\big(\frac{2\pi r}{q}\big)$ where $0 \leq r \leq q - 1$. In particular, the second smallest eigenvalue of $\mathbf{L}(C_q)$, denoted by $\tau$, is given by $1 - \cos\big(\frac{2\pi}{q}\big)$.

Let $\times$ denote the Cartesian product of graphs, that is, for any two graphs $G_1 = (V_1, E_1)$, $G_2 = (V_2, E_2)$ the graph $G := G_1 \times G_2$ with $G = (V, E)$ is defined by $V = V_1 \times V_2$ and

$$E := \big\{ \big( (u_1, u_2), (v_1, u_2) \big) : u_2 \in V_2 \wedge \{u_1, v_1\} \in E_1 \big\} \cup \big\{ \big( (u_1, u_2), (u_1, v_2) \big) : u_1 \in V_1 \wedge \{u_2, v_2\} \in E_2 \big\}.$$

It is straightforward to generalize this definition to the Cartesian product of more than two graphs and it is then easy to check that $T(n_1, n_2, \ldots, n_d) = C_{n_1} \times C_{n_2} \times \ldots \times C_{n_d}$. The following theorem expresses the second smallest eigenvalue of the Cartesian product of graphs in terms of the second smallest eigenvalue of the respective graphs.

Theorem 5.3 ([3, Theorem 2.12]). Let $G_1, G_2, \ldots, G_d$ be $d$ graphs and let $\tau_1, \tau_2, \ldots, \tau_d$ be the respective second smallest eigenvalue of $\mathbf{L}(G_1), \mathbf{L}(G_2), \ldots, \mathbf{L}(G_d)$. Then the second smallest eigenvalue $\tau$ of $\mathbf{L}(G_1 \times G_2 \times \ldots \times G_d)$ satisfies $\tau = \frac{1}{d} \min_{k=1}^{d} \tau_k$.

Applying this theorem to our setting, it follows that the second smallest eigenvalue $\tau$ of $\mathbf{L}(T)$ is $\tau = \frac{1}{d} \big( 1 - \cos\big( \frac{2\pi}{n_d} \big) \big)$. As $n_d \geq \sqrt[d]{n}$, we have $\cos\big( \frac{2\pi}{n_d} \big) = 1 - \Theta\big( \frac{1}{n_d^2} \big)$. Using this and the fact that $d$ is a constant, we obtain that $\tau = \Theta\big( \frac{1}{n_d^2} \big)$. As $T$ is a $k$-regular graph, the transition matrix $\mathbf{P}(T)$ can be expressed as $\mathbf{P}(T) = \mathbf{I} - \frac{1}{2} \mathbf{L}(T)$. This implies for the second smallest eigenvalue of $\mathbf{L}(T)$, $\tau$, and the second largest eigenvalue of the transition matrix $\mathbf{P}(T)$, $\lambda_2$, that $\lambda_2 = 1 - \frac{1}{2} \tau$. Hence $\lambda_2 = 1 - \Theta\big( \frac{1}{n_d^2} \big)$, which completes the proof. □
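The estimate of Lemma 5.2 can be illustrated numerically. The sketch below is not from the paper and its function names are hypothetical: it uses the standard fact that the lazy walk on a product of cycles has eigenvalues $\frac{1}{2} + \frac{1}{2d} \sum_k \cos(2\pi r_k / n_k)$, takes the second largest one, and compares $(1 - \lambda_2)^{-1}$ with the closed form $2d / (1 - \cos(2\pi / n_d))$ that follows from $\lambda_2 = 1 - \frac{1}{2}\tau$.

```python
from itertools import product
from math import cos, pi


def lazy_torus_eigenvalues(sides):
    """All eigenvalues of the lazy walk (loop probability 1/2) on T(n_1, ..., n_d), descending."""
    d = len(sides)
    return sorted(
        (
            0.5 + sum(cos(2 * pi * r / n) for r, n in zip(rs, sides)) / (2 * d)
            for rs in product(*(range(n) for n in sides))
        ),
        reverse=True,
    )


def inverse_gap(sides):
    """(1 - lambda_2)^(-1), read off the explicit spectrum."""
    lam2 = lazy_torus_eigenvalues(sides)[1]
    return 1.0 / (1.0 - lam2)


def predicted_inverse_gap(sides):
    """Closed form 2d / (1 - cos(2*pi / n_d)), where n_d is the largest side length."""
    d, n_max = len(sides), max(sides)
    return 2 * d / (1.0 - cos(2 * pi / n_max))
```

For a non-uniform torus such as `(5, 20)` the ratio `inverse_gap(sides) / 20**2` stays close to $d / \pi^2$, in line with the $\Theta(n_d^2)$ bound being governed by the longest side only.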

Note that the corresponding results of [11, 32] only hold for uniform torus graphs while the following result for our algorithm holds for general torus graphs.

Theorem 5.4. For all initial load vectors on the (not necessarily uniform) $d$-dimensional torus graph with $n$ vertices, the deviation between the idealized process and a discrete process with accumulated rounding error at most $\Lambda$ is $O(\Lambda)$ at all times.

For any torus graph, we know that $(1 - \lambda_2)^{-1} = \Theta(n_d^2)$ by Lemma 5.2. With Theorem 3.1 it follows that any BED algorithm (and in particular our quasirandom algorithm) reduces the discrepancy of any initial load vector with discrepancy $K$ to $O(1)$ within $O(n_d^2 \log(Kn))$ time steps (for uniform torus graphs, this number of time steps is $O(n^{2/d} \log(Kn))$).

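Proof of Theorem 5.4. By symmetry of the torus graph, we have $\mathbf{P}_{i,j} = \mathbf{P}_{0,i-j}$. Hence we set $\mathbf{P}_i = \mathbf{P}_{0,i}$. We will first reduce the random walk $\mathbf{P}_{i,j}$ on the finite $d$-dimensional torus to a random walk on the infinite grid $\mathbb{Z}^d$, both with loop probability 1/2. Let $\overline{\mathbf{P}}_{i,j}$ be the transition probability from $i$ to $j$ on $\mathbb{Z}^d$ defined by $\overline{\mathbf{P}}_{i,j} = 1/(4d)$ if $\|i - j\|_1 = 1$, $\overline{\mathbf{P}}_{i,i} = 1/2$, and 0 otherwise. For $i = (i_1, \ldots, i_d) \in V$ we set

$$H(i) := (i_1 + n_1 \mathbb{Z},\; i_2 + n_2 \mathbb{Z},\; \ldots,\; i_d + n_d \mathbb{Z}) \subset \mathbb{Z}^d.$$

With $\overline{\mathbf{P}}_i := \overline{\mathbf{P}}_{0,i}$, we observe

$$\mathbf{P}_i^s = \sum_{k \in H(i)} \overline{\mathbf{P}}_k^s$$

for all $s \geq 0$ and $i \in V$. We extend the definition of $e_{i,j}$ in the natural way by setting

$$e_{k,\ell} := e_{i,j} \quad \text{for all } i, j \in V \text{ and } k \in H(i),\ \ell \in H(j).$$

Let $\mathrm{ARR} = \{ \pm u_\ell \mid \ell \in \{1, \ldots, d\} \} \subset \mathbb{Z}^d$ with $u_\ell$ being the $\ell$-th unit vector. Following equation (3.1) and using the fact that by symmetry it suffices to bound the deviation at the vertex $0 := 0^d$, we get

$$x_0^{(t)} - \xi_0^{(t)} = \frac{1}{2} \sum_{s=0}^{t-1} \sum_{i \in V} \sum_{z \in \mathrm{ARR}} e_{i,i+z}^{(t-s)} \big( \mathbf{P}_i^s - \mathbf{P}_{i+z}^s \big)$$

$$= \frac{1}{2} \sum_{s=0}^{t-1} \sum_{i \in V} \sum_{z \in \mathrm{ARR}} e_{i,i+z}^{(t-s)} \Big( \sum_{k \in H(i)} \overline{\mathbf{P}}_k^s - \sum_{\ell \in H(i+z)} \overline{\mathbf{P}}_\ell^s \Big)$$

$$= \frac{1}{2} \sum_{s=0}^{t-1} \sum_{i \in V} \sum_{z \in \mathrm{ARR}} \sum_{k \in H(i)} e_{k,k+z}^{(t-s)} \big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big)$$

$$= \frac{1}{2} \sum_{i \in V} \sum_{z \in \mathrm{ARR}} \sum_{k \in H(i)} \sum_{s=0}^{t-1} e_{k,k+z}^{(t-s)} \big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big).$$

As $\mathbb{Z}^d = \bigcup_{i \in V} H(i)$ is a disjoint union, we can also write

$$x_0^{(t)} - \xi_0^{(t)} = \frac{1}{2} \sum_{k \in \mathbb{Z}^d} \sum_{z \in \mathrm{ARR}} \sum_{s=0}^{t-1} e_{k,k+z}^{(t-s)} \big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big). \tag{5.1}$$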
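The reduction from the torus to the infinite grid rests on the folding identity: the probability of being at torus vertex $i$ after $s$ steps equals the sum of the grid probabilities over all lifts of $i$. The sketch below checks this in the one-dimensional case (a cycle versus the integer line) with exact arithmetic. It is an illustration with hypothetical function names, and it uses the fact that a lazy step on the line is the sum of two independent $\pm\frac{1}{2}$ steps, so the $s$-step probability of reaching $k$ is $\binom{2s}{s+k} / 4^s$.

```python
from fractions import Fraction
from math import comb


def lazy_cycle_distribution(n, s):
    """Distribution after s lazy steps on the n-cycle, started at vertex 0."""
    dist = [Fraction(0)] * n
    dist[0] = Fraction(1)
    for _ in range(s):
        dist = [
            dist[i] / 2 + (dist[(i - 1) % n] + dist[(i + 1) % n]) / 4
            for i in range(n)
        ]
    return dist


def folded_grid_distribution(n, s):
    """Lazy walk on the integer line after s steps, folded modulo n."""
    folded = [Fraction(0)] * n
    for k in range(-s, s + 1):
        folded[k % n] += Fraction(comb(2 * s, s + k), 4**s)
    return folded
```

Comparing `lazy_cycle_distribution(n, s)` with `folded_grid_distribution(n, s)` for a few small `n` and `s` gives identical lists of fractions.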



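We now carefully break down the sums of equation (5.1) and show that each part can be bounded by $O(\Lambda)$. For this, our main tool will be Lemma 3.6. As we cannot prove unimodality of $\big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big)$ directly, we will use appropriate local central limit theorems to approximate the transition probabilities $\overline{\mathbf{P}}_k^s$ of $\mathbb{Z}^d$ with a multivariate normal distribution. To derive the limiting distribution $\widetilde{\mathbf{P}}_k^s$ of our random walk $\overline{\mathbf{P}}_{i,j}$, we follow Lawler and Limic [25] and let $X = (X_1, \ldots, X_d)$ be a $\mathbb{Z}^d$-valued random variable with $\Pr[X = z] = 1/(4d)$ for all $z \in \mathrm{ARR}$ and $\Pr[X = 0^d] = 1/2$. Observe that $\mathrm{E}[X_j X_k] = 0$ for $j \neq k$ since not both of them can be non-zero simultaneously. Moreover, $\mathrm{E}[X_j^2] = \frac{1}{4d} (-1)^2 + \frac{1}{4d} (+1)^2 = \frac{1}{2d}$ for all $1 \leq j \leq d$. Hence the covariance matrix is

$$\Gamma := \big[ \mathrm{E}[X_j X_k] \big]_{1 \leq j,k \leq d} = (2d)^{-1}\, \mathbf{I}.$$

From Eq. (2.2) of Lawler and Limic [25] we get

$$\widetilde{\mathbf{P}}_k^s = \frac{1}{(2\pi)^d\, s^{d/2}} \int_{\mathbb{R}^d} \exp\Big( \mathrm{i}\, \frac{x \cdot k}{\sqrt{s}} \Big) \exp\Big( - \frac{x \cdot \Gamma x}{2} \Big)\, d^d x,$$

where $\mathrm{i} = \sqrt{-1}$ denotes the imaginary unit. With this we can further conclude that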



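$$\widetilde{\mathbf{P}}_k^s = \frac{1}{(2\pi)^d\, s^{d/2}} \int_{\mathbb{R}^d} \exp\Big( \mathrm{i}\, \frac{x \cdot k}{\sqrt{s}} - \frac{x \cdot \Gamma x}{2} \Big)\, d^d x = \frac{1}{(2\pi)^d\, s^{d/2}} \int_{\mathbb{R}^d} \exp\Big( - \frac{1}{4d} \Big( \|x\|_2^2 - 2 \mathrm{i}\, \frac{2d}{\sqrt{s}}\, x \cdot k \Big) \Big)\, d^d x. \tag{5.2}$$

To evaluate the integral we complete the square, which yields

$$\exp\Big( - \frac{1}{4d} \Big( \|x\|_2^2 - 2 \mathrm{i}\, \frac{2d}{\sqrt{s}}\, x \cdot k \Big) \Big) = \exp\Big( - \frac{1}{4d} \Big( \Big( x - \mathrm{i}\, \frac{2d}{\sqrt{s}}\, k \Big)^2 + \frac{4d^2}{s}\, \|k\|_2^2 \Big) \Big). \tag{5.3}$$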



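By substituting $z = x - \mathrm{i}\, \frac{2d}{\sqrt{s}}\, k$ we get

$$\int_{\mathbb{R}^d} \exp\Big( - \frac{1}{4d} \Big( x - \mathrm{i}\, \frac{2d}{\sqrt{s}}\, k \Big)^2 \Big)\, d^d x = \int_{\mathbb{R}^d} \exp\Big( - \frac{\|z\|_2^2}{4d} \Big)\, d^d z$$

$$= \int_{\mathbb{R}} \exp\Big( - \frac{z_1^2}{4d} \Big) \cdots \Big( \int_{\mathbb{R}} \exp\Big( - \frac{z_{d-1}^2}{4d} \Big) \Big( \int_{\mathbb{R}} \exp\Big( - \frac{z_d^2}{4d} \Big)\, dz_d \Big)\, dz_{d-1} \Big) \cdots\, dz_1$$

$$= \prod_{i=1}^{d} \int_{\mathbb{R}} \exp\Big( - \frac{z_i^2}{4d} \Big)\, dz_i = \big( 2 \sqrt{\pi d} \big)^d. \tag{5.4}$$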



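Combining equations (5.2), (5.3), and (5.4), we get

$$\widetilde{\mathbf{P}}_k^s = \frac{1}{(2\pi)^d\, s^{d/2}}\, \big( 2 \sqrt{\pi d} \big)^d \exp\Big( - \frac{d\, \|k\|_2^2}{s} \Big) = \Big( \frac{d}{\pi s} \Big)^{d/2} \exp\Big( \frac{- d\, \|k\|_2^2}{s} \Big). \tag{5.5}$$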
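For $d = 1$ the quality of the normal approximation in (5.5) can be inspected directly, because the lazy walk on the integer line has a closed form: a lazy step is the sum of two independent $\pm\frac{1}{2}$ steps, so the $s$-step probability of reaching $k$ is $\binom{2s}{s+k} / 4^s$. The following sketch, with hypothetical function names, evaluates both expressions; it only illustrates the approximation and says nothing about the constants hidden in the local central limit theorem.

```python
from math import comb, exp, pi


def exact_lazy_walk_1d(k, s):
    """Exact s-step probability of moving from 0 to k for the lazy walk on the integer line."""
    if abs(k) > s:
        return 0.0
    return comb(2 * s, s + k) / 4**s


def normal_approximation(k_vec, s):
    """Right-hand side of (5.5): (d / (pi*s))^(d/2) * exp(-d * ||k||_2^2 / s)."""
    d = len(k_vec)
    norm_sq = sum(c * c for c in k_vec)
    return (d / (pi * s)) ** (d / 2) * exp(-d * norm_sq / s)
```

At $s = 100$ and $k = 0$ the two values agree to roughly four decimal places, and the gap shrinks further as $s$ grows.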



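It follows directly from Claims 4 and 5 of Cooper and Spencer [4] that for all $k \in \mathbb{Z}^d$, $z \in \mathrm{ARR}$,

$$\widetilde{\mathbf{P}}_k^s - \widetilde{\mathbf{P}}_{k+z}^s = O\big( \|k\|_2^{-(d+1)} \big) \quad \text{for all } s, \tag{5.6}$$

$$s \mapsto \widetilde{\mathbf{P}}_k^s - \widetilde{\mathbf{P}}_{k+z}^s \quad \text{has only a constant number of local extrema.} \tag{5.7}$$

This gives the intuition that by approximating $\big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big)$ with $\big( \widetilde{\mathbf{P}}_k^s - \widetilde{\mathbf{P}}_{k+z}^s \big)$ we can bound equation (5.1) for sufficiently large $k$ and $s$ by Lemma 3.6. This approximation is made precise by the following local central limit theorems. Theorem 2.3.6 of Lawler and Limic [25] gives for all $k \in \mathbb{Z}^d$, $z \in \mathrm{ARR}$, $s > 0$,

$$\Big| \big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big) - \big( \widetilde{\mathbf{P}}_k^s - \widetilde{\mathbf{P}}_{k+z}^s \big) \Big| = O\big( s^{-(d+3)/2} \big). \tag{5.8}$$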



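We first separate the case $k = 0$ in equation (5.1). With $\mathbb{Z}^d_{\neq 0} := \mathbb{Z}^d \setminus \{0^d\}$,

$$\big| x_0^{(t)} - \xi_0^{(t)} \big| \leq \underbrace{\frac{1}{2} \Big| \sum_{z \in \mathrm{ARR}} \sum_{s=0}^{t-1} e_{0,0+z}^{(t-s)} \big( \overline{\mathbf{P}}_0^s - \overline{\mathbf{P}}_{0+z}^s \big) \Big|}_{(5.9a)} + \underbrace{\frac{1}{2} \Big| \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{z \in \mathrm{ARR}} \sum_{s=0}^{t-1} e_{k,k+z}^{(t-s)} \big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big) \Big|}_{(5.9b)}. \tag{5.9}$$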



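Now we can apply the local central limit theorem given in equation (5.8) to (5.9a) and get

$$(5.9a) = \frac{1}{2} \Big| \sum_{z \in \mathrm{ARR}} \sum_{s=0}^{t-1} e_{0,0+z}^{(t-s)} \big( \widetilde{\mathbf{P}}_0^s - \widetilde{\mathbf{P}}_{0+z}^s \big) \Big| + O\Big( \sum_{z \in \mathrm{ARR}} \sum_{s=0}^{t-1} s^{-(d+3)/2} \Big) = O(\Lambda),$$

where the last equality follows by Lemma 3.6 combined with equation (5.7) and the property $\big| \sum_{s=1}^{t} e_{i,j}^{(s)} \big| \leq \Lambda$.

We proceed by fixing a cutoff point $T(k) := \frac{C\, \|k\|_2^2}{\ln^2(\|k\|_2)}$, $k \in \mathbb{Z}^d_{\neq 0}$, of the innermost sum of (5.9b) for some sufficiently small constant $C > 0$,

$$(5.9b) \leq \underbrace{\frac{1}{2} \Big| \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{z \in \mathrm{ARR}} \sum_{s=1}^{T(k)} e_{k,k+z}^{(t-s)} \big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big) \Big|}_{(5.10a)} + \underbrace{\frac{1}{2} \Big| \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{z \in \mathrm{ARR}} \sum_{s=T(k)}^{t-1} e_{k,k+z}^{(t-s)} \big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big) \Big|}_{(5.10b)}. \tag{5.10}$$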



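Note that the summand with $s = 0$ is zero and can be ignored. The first summand (5.10a) can be bounded by

$$(5.10a) = O\Big( \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{z \in \mathrm{ARR}} \sum_{s=1}^{T(k)} \big( \overline{\mathbf{P}}_k^s + \overline{\mathbf{P}}_{k+z}^s \big) \Big). \tag{5.11}$$

It is known from Lawler [24, Lem. 1.5.1(a)] that for random walks on infinite grids, $\sum_{\|k\|_2 \geq \lambda \sqrt{s}} \overline{\mathbf{P}}_k^s = O(e^{-\lambda})$ for all $s > 0$ and $\lambda > 0$. Hence also

$$\overline{\mathbf{P}}_k^s = O\big( \exp( - \|k\|_2 / \sqrt{s} ) \big) \quad \text{for all } s > 0,\ k \in \mathbb{Z}^d.$$

With this we can now bound the term $\big( \overline{\mathbf{P}}_k^s + \overline{\mathbf{P}}_{k+z}^s \big)$ from equation (5.11). For $0 < s \leq T(k)$, $k \in \mathbb{Z}^d_{\neq 0}$, $z \in \mathrm{ARR}$, and sufficiently small $C > 0$,

$$\overline{\mathbf{P}}_k^s + \overline{\mathbf{P}}_{k+z}^s = O\Big( \exp\Big( - \frac{\|k\|_2}{\sqrt{s}} \Big) + \exp\Big( - \frac{\|k+z\|_2}{\sqrt{s}} \Big) \Big) = O\Big( \exp\Big( - \frac{\|k\|_2 - 1}{\sqrt{T(k)}} \Big) \Big)$$

$$= O\Big( \exp\Big( - \frac{\ln(\|k\|_2)\, (\|k\|_2 - 1)}{\sqrt{C}\, \|k\|_2} \Big) \Big) = O\big( \|k\|_2^{-(d+4)} \big).$$

Plugging this into equation (5.11), we obtain that

$$(5.10a) = O\Big( \sum_{k \in \mathbb{Z}^d_{\neq 0}} T(k)\, \|k\|_2^{-(d+4)} \Big) = O\Big( \sum_{k \in \mathbb{Z}^d_{\neq 0}} \|k\|_2^{-(d+2)} \Big) = O(1).$$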



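To bound (5.10b), we approximate the transition probabilities of $\mathbb{Z}^d$ with the multivariate normal distribution of equation (5.5) by the local central limit theorem stated in equation (5.8),

$$(5.10b) = \frac{1}{2} \Big| \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{z \in \mathrm{ARR}} \sum_{s=T(k)}^{t-1} e_{k,k+z}^{(t-s)} \big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big) \Big|$$

$$\leq \underbrace{\frac{1}{2} \Big| \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{z \in \mathrm{ARR}} \sum_{s=T(k)}^{t-1} e_{k,k+z}^{(t-s)} \big( \widetilde{\mathbf{P}}_k^s - \widetilde{\mathbf{P}}_{k+z}^s \big) \Big|}_{(5.12a)}$$

$$+ \underbrace{\frac{1}{2} \Big| \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{z \in \mathrm{ARR}} \sum_{s=T(k)}^{t-1} e_{k,k+z}^{(t-s)} \Big( \big( \overline{\mathbf{P}}_k^s - \overline{\mathbf{P}}_{k+z}^s \big) - \big( \widetilde{\mathbf{P}}_k^s - \widetilde{\mathbf{P}}_{k+z}^s \big) \Big) \Big|}_{(5.12b)}. \tag{5.12}$$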

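We can bound the second term (5.12b) by

$$(5.12b) = O\Big( d \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{s=T(k)}^{t-1} s^{-(d+3)/2} \Big) = O\Big( \sum_{k \in \mathbb{Z}^d_{\neq 0}} T(k)^{-(d+1)/2} \Big).$$

As there are constants $C' > 0$ and $\varepsilon > 0$ such that $\ln^{d+1}(\|k\|_2) \leq C'\, \|k\|_2^{1-\varepsilon}$ for all $k \in \mathbb{Z}^d_{\neq 0}$, we obtain that

$$(5.12b) = O\Big( \sum_{k \in \mathbb{Z}^d_{\neq 0}} \|k\|_2^{-(d+\varepsilon)} \Big).$$

To see that this can be bounded by $O(1)$, observe that with $\mathbb{N}^d_{\neq 0} := \mathbb{N}^d \setminus \{0^d\}$,

$$\sum_{k \in \mathbb{Z}^d_{\neq 0}} \|k\|_2^{-(d+\varepsilon)} \leq 2^d \sum_{k \in \mathbb{N}^d_{\neq 0}} \big( k_1^2 + \cdots + k_d^2 \big)^{-(d+\varepsilon)/2}.$$

By convexity of $x \mapsto x^2$, $k_1^2 + \cdots + k_d^2 \geq \frac{1}{d} (k_1 + \cdots + k_d)^2$, and hence

$$(5.12b) = O\Big( \sum_{k \in \mathbb{N}^d_{\neq 0}} (k_1 + \cdots + k_d)^{-(d+\varepsilon)} \Big) = O\Big( \sum_{x=1}^{\infty} \sum_{\substack{k \in \mathbb{N}^d \\ \|k\|_1 = x}} x^{-(d+\varepsilon)} \Big)$$

$$= O\Big( \sum_{x=1}^{\infty} x^{d-1} \cdot x^{-(d+\varepsilon)} \Big) = O\Big( \sum_{x=1}^{\infty} x^{-(1+\varepsilon)} \Big) = O(1).$$

To finally bound (5.12a), we apply equation (5.7). We also use that $\widetilde{\mathbf{P}}_k^s - \widetilde{\mathbf{P}}_{k+z}^s$ can be bounded by $O\big( \|k\|_2^{-(d+1)} \big)$ according to equation (5.6). As $\big| \sum_{s=1}^{t} e_{i,j}^{(s)} \big| \leq \Lambda$, applying Lemma 3.6 yields

$$(5.12a) = O\Big( \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{z \in \mathrm{ARR}} \Lambda \max_{s=T(k)}^{t-1} \big| \widetilde{\mathbf{P}}_k^s - \widetilde{\mathbf{P}}_{k+z}^s \big| \Big)$$

$$= O\Big( \sum_{k \in \mathbb{Z}^d_{\neq 0}} \sum_{z \in \mathrm{ARR}} \Lambda\, \|k\|_2^{-(d+1)} \Big) = O\Big( \Lambda d \sum_{k \in \mathbb{Z}^d_{\neq 0}} \|k\|_2^{-(d+1)} \Big) = O(\Lambda).$$

Combining all above bounds, we can conclude that $\big| x_0^{(t)} - \xi_0^{(t)} \big| = O(\Lambda)$, meaning that the deviation between the idealized process and the discrete process at any time and vertex is at most $O(\Lambda)$. □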

6. Lower bounds for previous algorithms. For a better comparison with previous algorithms, this section gives lower bounds for other discrete diffusion processes. First, we observe the following general lower bound on the discrepancy for the RSW-algorithm.

Proposition 6.1. On all graphs $G$ with maximum degree $\Delta$, there is an initial load vector $x^{(0)}$ with discrepancy $\Delta\, \mathrm{diam}(G)$ such that for the RSW-algorithm, $x^{(t)} = x^{(t-1)}$ for all $t \in \mathbb{N}$.

Proof. Fix a pair of vertices $i$ and $j$ with $\mathrm{dist}(i,j) = \mathrm{diam}(G)$. Define an initial load vector $x^{(0)}$ by

$$x_k^{(0)} := \mathrm{dist}(k, i) \cdot \Delta.$$

Clearly, the initial discrepancy is $x_j^{(0)} - x_i^{(0)} = \Delta\, \mathrm{diam}(G)$. We claim that $x^{(1)} = x^{(0)}$. Consider an arbitrary edge $\{r,s\} \in E(G)$. Then,

$$\big| \mathbf{P}_{r,s}\, x_r^{(0)} - \mathbf{P}_{s,r}\, x_s^{(0)} \big| = \frac{1}{2\Delta} \big| x_r^{(0)} - x_s^{(0)} \big| \leq \frac{1}{2\Delta}\, \Delta = \frac{1}{2}.$$

Hence the integral flow on any edge $\{r,s\} \in E(G)$ is $\lfloor \frac{1}{2} \rfloor = 0$ and the load vector remains unchanged. The claim follows. □
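The stuck instance of Proposition 6.1 is easy to reproduce. The sketch below is a simplified model rather than the RSW implementation itself: it assumes that every edge carries transition probability $1/(2\Delta)$ and that the algorithm rounds the fractional flow on each edge down to the next integer, as used in the proof above. The graph is given as an adjacency dictionary and all names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Graph distances from `source` in an undirected graph given as {vertex: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def round_down_step(adj, load):
    """One synchronous step: each edge moves floor(|x_u - x_v| / (2*Delta)) tokens downhill."""
    delta = max(len(nbrs) for nbrs in adj.values())
    new = dict(load)
    for u in adj:
        for v in adj[u]:
            if load[u] > load[v]:  # each edge is handled once, from its heavier endpoint
                flow = (load[u] - load[v]) // (2 * delta)
                new[u] -= flow
                new[v] += flow
    return new


def stuck_instance(adj, source):
    """Initial load x_k = dist(k, source) * Delta from the proof of Proposition 6.1."""
    delta = max(len(nbrs) for nbrs in adj.values())
    return {v: dist * delta for v, dist in bfs_distances(adj, source).items()}
```

On a cycle with 8 vertices, `stuck_instance` produces loads with discrepancy $\Delta \cdot \mathrm{diam} = 2 \cdot 4 = 8$, and `round_down_step` returns the same load vector.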

In the remainder of this section we present two lower bounds for the deviation between the randomized rounding diffusion algorithm and the idealized process. First, we prove a bound of $\Omega(\log n)$ for the hypercube. Together with Theorem 4.2 this implies that on hypercubes the quasirandom approach is as good as the randomized rounding diffusion algorithm.

Theorem 6.2. There is an initial load vector of the $d$-dimensional hypercube with $n = 2^d$ vertices such that the deviation of the randomized rounding diffusion algorithm and the idealized process is at least $\log n / 4$ with probability $1 - n^{-\Omega(1)}$.

Proof. We define an initial load vector $x^{(0)}$ as follows. For every vertex $v = (v_1, v_2, \ldots, v_d) \in \{0,1\}^d$ with $v_1 = 0$ we set $x_v^{(0)} = \xi_v^{(0)} = 0$ and if $v_1 = 1$ we set $x_v^{(0)} = \xi_v^{(0)} = d$. Hence, the idealized process will send a flow of $d/(2d) = 1/2$ from every vertex $v = (1, v_2, v_3, \ldots, v_d) \in \{0,1\}^d$ to $(0, v_2, v_3, \ldots, v_d)$. Hence for the idealized process, $\xi_v^{(1)} = (1/2)\, d$, that is, all vertices have a load of $(1/2)\, d$ after one step and the load is perfectly balanced.

Let us now consider the discrete process. Let $V_0$ be the set of vertices whose bitstring begins with 0. Consider any node $v \in V_0$. Note that all neighbors of $v$ have a load of $d$ and the integral flow from any of those neighbors equals 1 with probability 1/2, independently. Hence the load of $v$ in the next step is just a binomial random variable and using the fact that $\binom{r}{s} \geq (r/s)^s$ we obtain



Pr 



^ Pr 



d \ /4\(3/4)d 



for some constant $C > 0$ since $d = \log_2 n$. As the maximum degree of the graph is $\log n$
and thc sizc of Vo is n/2, it follows that thcrc is a subsct 5 C Vo of sizc V, ( ) in 
the hypercube such that every pair in $S$ has distance at least 4. By construction, the respective events $x_v^{(1)} \geq (3/4)\, d$ are independent for all vertices $v \in S$. Hence



Pr 



3v e S: a;W ^^d 
4 



where $0 < C' < C$ is another constant. This means that with probability at least $1 - n^{-C'}$ the load at some vertex $v \in S$ at step 1 will be at least $(3/4)\, d$ in the discrete process, but equals $(1/2)\, d$ in the idealized process. This completes the proof. □

It remains to give a lower bound for the deviation between the randomized rounding and the idealized process for torus graphs. The following theorem proves a polylogarithmic lower bound for the randomized rounding algorithm which should be compared to the constant upper bound for the quasirandom approach of Theorem 5.4. Similar results can also be derived for non-uniform torus graphs.

Theorem 6.3. There is an initial load vector of the $d$-dimensional uniform torus graph with $n$ vertices such that the deviation between the randomized rounding diffusion algorithm and the idealized process is $\Omega(\mathrm{polylog}(n))$ with probability $1 - o(1)$.

Proof. Let $n$ be a sufficiently large integer and $T$ be a $d$-dimensional torus graph with $n$ vertices and side-length $\sqrt[d]{n} \in \mathbb{N}$. Let $B_k(u) := \{ v \in V : \|v - u\|_\infty \leq k \}$ and $\partial B_k(u) := \{ v \in V : \|v - u\|_\infty = k \}$. For every vertex $v \in V(T)$, we have $|B_{\ell/2}(v)| = \ell^d = (\log n)^{1/4}$ with $\ell := (\log n)^{1/(4d)}$, where we assume w.l.o.g. that $\ell$ is an odd integer. For $\ell' := (\log n)^{2/(3d)}$, define a set $S \subseteq V$ by

$$S := \big\{ (x_1 \ell', x_2 \ell', \ldots, x_d \ell') \;\big|\; 1 \leq x_1, x_2, \ldots, x_d < \sqrt[d]{n} / \ell' - 1 \big\},$$

that is, every pair of distinct vertices in $S$ has a coordinate-wise distance which is a multiple of $\ell'$. Note that $|S| = \Omega(n / \ell'^d)$. Define the initial load vector as $x_i^{(0)} = \xi_i^{(0)} := 2d \cdot \max\{0, \ell/2 - \mathrm{dist}(i, S)\}$, $i \in V$. Clearly, the initial discrepancy is $K = 2d \cdot \ell/2$.

The idea is now to decompose $T$ in smaller subgraphs centered around $s \in S$, since the upper bound on the convergence rate given by Theorem 3.1 has a strong dependence on the size of the graph. Then we relate the simultaneous convergence on each of the smaller graphs to the convergence on the original graph. An illustration of our decomposition of $T$ can be found in Figure 6.1.

Fix some $s \in S$. Then the subgraph induced by the vertices $B_{\ell/2}(s)$ is a $d$-dimensional grid with exactly $n' := (\log n)^{1/4}$ vertices. Let $T' = T'(s)$ denote the corresponding $d$-dimensional torus graph with the same vertices, but additional wrap-around edges between vertices of $\partial B_{\ell/2}(s)$. W.l.o.g. we assume that the side-length $\sqrt[d]{n}$ of $T$ is a multiple of the side-length $\ell$ of $T'(s)$. Let $\mathbf{P}'$ be the diffusion matrix of $T'(s)$.

Fig. 6.1: Overview of the decomposition of $T$ into various $T'(s)$ for the two-dimensional case $d = 2$. The inner rectangles represent the various smaller grids $T'(s)$ with $s \in S$. The darkness indicates the amount of the initial load. Note that the initial load of vertices outside the $T'(s)$'s is 0.

Let us denote by $\xi'^{(0)}$ ($x'^{(0)}$) the projection of the load vector $\xi^{(0)}$ ($x^{(0)}$) from $T$ onto $T'(s)$. By Corollary 3.2, the idealized process reduces the discrepancy on $T'(s)$ from $K = d\, (\log n)^{1/(4d)}$ to 1 within $t_0 := O\big( (n')^{2/d} \log(K n') \big) = O\big( \log\log(n)\, (\log n)^{1/(2d)} \big)$ time steps. We now want to argue that this also happens on the original graph $T$ with $n$ vertices. Note that the convergence of the idealized process on $T'(s)$ implies

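$$\big\| \xi'^{(t_0)} - \bar{\xi}' \big\|_\infty = \big\| \mathbf{P}'^{\,t_0}\, \xi'^{(0)} - \bar{\xi}' \big\|_\infty \leq 1. \tag{6.1}$$

Furthermore, note that the average load $\bar{\xi}'$ in each $T'(s)$ satisfies

$$\bar{\xi}' \leq 2d \cdot \ell/4.$$

Our next observation is that for any two vertices $u, v \in T'(s)$,

$$\mathbf{P}^{t_0}_{u,v} \leq \mathbf{P}'^{\,t_0}_{u,v} \tag{6.2}$$

as a random walk on $T'(s)$ can be expressed as a projection of a random walk on $T$ (by assigning each vertex in $T'(s)$ to a set of vertices in $T$). With the observations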
. forveT'(s): d"^ = Cu'^"^ 

• for G T and £/2 < dist(?;, s) ^ to: = (as to = o{£' - £/2)), 

• for w e T and dist(t;, s) > to: P*% = 0, 
we obtain for any vertex v G B^/2(s), 

t(to) = Cp(to) . t(0)\ ^^f{0)p{to)^ t'(0)p(to)_ 

iieT iigT'(s) 

By first applying equation (6.2) and then equation (6.1), we get 

veJ'{s) 
This means that the idealized process achieves after $t_0$ time steps a good balancing at $s$. On the other hand, the discrete process may fail within $t_0$ time steps if there is an $s$ such that all edges in $T'(s)$ round towards $s$ at all time steps $t \leq t_0$. (Note that by construction, no load from another $T'(s')$, $s' \in S \setminus \{s\}$, can reach $T'(s)$ within $t_0$ steps, since the distance between any vertex in $T'(s)$ and $T'(s')$ is $\ell' - 2\ell \geq t_0$.) Moreover, by definition of $x^{(0)}$, $\big| x_u^{(0)} - x_v^{(0)} \big| \in \{0, 2d\}$ if $\{u,v\} \in E(T)$. Hence the fractional flow in the first step is $\in \{0, \frac{1}{2}\}$ and for fixed $s$ the probability that $x_u^{(1)} = x_u^{(0)}$ for all $u \in T'(s)$
is at lcast ^^l^*/^^'*)! . By induction, for fixed s the probability that x^^ = Xu°^ holds for 
all u G T'(s) is at least 



As we have $|S| = \Omega(n / \ell'^d) = \Omega(\mathrm{poly}(n))$ independent events, it follows that there is at least one $s \in S$ with $x_s^{(t_0)} = x_s^{(0)} = \ell/2 \cdot 2d$ with probability



where $C > 0$ is some constant. If this happens, then the deviation between the discrete and idealized process at vertex $s \in S$ at step $t_0$ is at least



7. Conclusions. We propose and analyze a new deterministic algorithm for balancing indivisible tokens. By achieving a constant discrepancy in optimal time on all torus graphs, our algorithm improves upon all previous deterministic and random approaches with respect to both running time and discrepancy. For hypercubes we prove a discrepancy of $\Theta(\log n)$ which is also significantly better than the (deterministic) RSW-algorithm which achieves a discrepancy of $\Omega(\log^2 n)$.

On a concrete level, it would be interesting to extend these results to other network topologies. From a higher perspective, our new algorithm provides a striking example of quasirandomness in algorithmics. Devising and analyzing similar algorithms for other tasks such as routing, scheduling or synchronization remains an interesting open problem.

Appendix A. Proof of a supplementary result.

In order to prove Proposition 4.10, we first note the following elementary lemma.

Lemma A.1. Let $(a_k)_{k=1}^{d}$, $(b_k)_{k=1}^{d}$, $(c_k)_{k=1}^{d}$ be three positive sequences such that
(i) for all $j \in [1, d]$, $\sum_{k=j}^{d} a_k \leq \sum_{k=j}^{d} b_k$,
(ii) $c_k$ is monotone increasing in $k$.
Then for all $j \in [1, d]$, $\sum_{k=j}^{d} a_k \cdot c_k \leq \sum_{k=j}^{d} b_k \cdot c_k$.

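Proof. Define $i := d - j + 1$. We will show that

$$\sum_{k=d-i+1}^{d} a_k \cdot c_k \leq \sum_{k=d-i+1}^{d} b_k \cdot c_k \tag{A.1}$$

for all $i \in [1, d]$. Our proof is by induction on the number of summands $i \in [1, d]$. The claim is trivial for $i = 1$ and $i = 2$. Assume inductively that (A.1) holds for all $i \leq i'$ and for all sequences satisfying the conditions of the lemma. We will show the claim for $i = i' + 1$, i.e.,

$$\sum_{k=d-i'}^{d} a_k \cdot c_k \leq \sum_{k=d-i'}^{d} b_k \cdot c_k.$$

Define two shorter sequences $(a'_k)_{k=1}^{d-1}$ and $(b'_k)_{k=1}^{d-1}$ as follows:

• $a'_k = a_k$ for $k < d - 1$ and $a'_{d-1} := a_{d-1} + \frac{c_d}{c_{d-1}}\, a_d$,

• $b'_k = b_k$ for $k < d - 1$ and $b'_{d-1} := b_{d-1} + \frac{c_d}{c_{d-1}}\, b_d$.

We will show that the sequences $(a'_k)_{k=1}^{d-1}$, $(b'_k)_{k=1}^{d-1}$, $(c_k)_{k=1}^{d-1}$ satisfy the conditions of the lemma. Since $(c_k)_k$ remained unchanged it suffices to show that for all $j' \in [1, d-1]$,

$$\sum_{k=j'}^{d-1} a'_k \leq \sum_{k=j'}^{d-1} b'_k,$$

or equivalently (using the definition of $a'_k$ and $b'_k$)

$$\sum_{k=j'}^{d-2} a_k + a_{d-1} + \frac{c_d}{c_{d-1}}\, a_d \leq \sum_{k=j'}^{d-2} b_k + b_{d-1} + \frac{c_d}{c_{d-1}}\, b_d.$$

By the first assumption of the lemma we have

$$\sum_{k=j'}^{d} a_k \leq \sum_{k=j'}^{d} b_k.$$

Moreover, since $a_d \leq b_d$ and $\frac{c_d}{c_{d-1}} \geq 1$, we have

$$\Big( \frac{c_d}{c_{d-1}} - 1 \Big)\, a_d \leq \Big( \frac{c_d}{c_{d-1}} - 1 \Big)\, b_d.$$

Adding the last two inequalities yields the required bound. Thus, $(a'_k)_{k=1}^{d-1}$, $(b'_k)_{k=1}^{d-1}$, $(c_k)_{k=1}^{d-1}$ satisfy the conditions of the lemma. By induction hypothesis on those sequences and for $i'$ summands, we have

$$\sum_{k=d-i'}^{d-1} a'_k \cdot c_k \leq \sum_{k=d-i'}^{d-1} b'_k \cdot c_k.$$

Plugging in the definition of $a'_k$ and $b'_k$ we finally obtain

$$\sum_{k=d-i'}^{d-2} a_k \cdot c_k + c_{d-1} \Big( a_{d-1} + \frac{c_d}{c_{d-1}}\, a_d \Big) \leq \sum_{k=d-i'}^{d-2} b_k \cdot c_k + c_{d-1} \Big( b_{d-1} + \frac{c_d}{c_{d-1}}\, b_d \Big),$$

which is precisely the induction claim for $i' + 1$. The lemma follows. □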
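Lemma A.1 lends itself to a quick randomized check. The sketch below is only an illustration with hypothetical function names: it draws random positive integer sequences, keeps those that satisfy conditions (i) and (ii), and tests whether the weighted suffix sums are ordered as the lemma claims. Such a test can expose a counterexample but of course does not replace the induction.

```python
import random


def suffix_sums(seq):
    """suffix_sums(seq)[j] = seq[j] + seq[j+1] + ... + seq[-1]."""
    total, out = 0, []
    for value in reversed(seq):
        total += value
        out.append(total)
    return out[::-1]


def satisfies_conditions(a, b, c):
    """Condition (i): suffix sums of a are dominated by those of b; (ii): c is increasing."""
    dominated = all(sa <= sb for sa, sb in zip(suffix_sums(a), suffix_sums(b)))
    increasing = all(x <= y for x, y in zip(c, c[1:]))
    return dominated and increasing


def conclusion_holds(a, b, c):
    """Conclusion of the lemma: weighted suffix sums of a are dominated by those of b."""
    wa = suffix_sums([x * y for x, y in zip(a, c)])
    wb = suffix_sums([x * y for x, y in zip(b, c)])
    return all(x <= y for x, y in zip(wa, wb))


def random_trial(d, rng):
    """Random positive integer sequences a, b and a sorted (hence increasing) sequence c."""
    a = [rng.randint(1, 20) for _ in range(d)]
    b = [rng.randint(1, 20) for _ in range(d)]
    c = sorted(rng.randint(1, 20) for _ in range(d))
    return a, b, c
```

Integer entries keep all comparisons exact; trials that violate condition (i) are simply skipped.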

Proposition 4.10. Let $i, j \in V$ be two vertices of the $d$-dimensional hypercube with $\mathrm{dist}(i,j) \geq d/2$. Then, $\mathbf{P}_{i,j}(t)$ is monotone increasing.

Proof. Fix an arbitrary step $t \in \mathbb{N}$. By symmetry, it suffices to prove that $\mathbf{P}_{0,x}(t) \leq \mathbf{P}_{0,x}(t+1)$ where $x \in \{0,1\}^d$ with $|x| \geq d/2$. First note that for all $j \in [|x|, d]$, $\sum_{k=j}^{d} \Pr[\text{exactly } k \text{ coordinates chosen in } t \text{ steps}] \leq \sum_{k=j}^{d} \Pr[\text{exactly } k \text{ coordinates chosen in } t+1 \text{ steps}]$ since the distribution of chosen coordinates after $t + 1$ steps clearly dominates the distribution of chosen coordinates after $t$ steps. Observe that for any $|x| \geq d/2$ the function $f(k) := 2^{-k} \binom{d - |x|}{k - |x|} \big/ \binom{d}{k}$ is monotone increasing in $|x| \leq k \leq d$. This can be verified by showing that $f(k) / f(k-1) \geq 1$ for any $k$ with $|x| < k \leq d$. This allows us to apply Lemma A.1 to equation (4.6), giving that $\mathbf{P}_{0,x}(t) \leq \mathbf{P}_{0,x}(t+1)$. Hence the proposition follows. □
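The monotonicity of $f$ used in the proof can be confirmed for small parameters. Expanding the binomial coefficients gives $f(k) / f(k-1) = \frac{k}{2 (k - |x|)}$, which is at least 1 exactly when $k \leq 2|x|$; this ratio is a short calculation added here for illustration and is not stated in the text above. The sketch below, with hypothetical function names and `weight` standing for $|x|$, evaluates $f$ with exact fractions.

```python
from fractions import Fraction
from math import comb


def f(d, weight, k):
    """f(k) = 2^(-k) * C(d - weight, k - weight) / C(d, k), for weight <= k <= d."""
    return Fraction(comb(d - weight, k - weight), 2**k * comb(d, k))


def is_monotone_increasing(d, weight):
    """True if f(weight) <= f(weight + 1) <= ... <= f(d)."""
    values = [f(d, weight, k) for k in range(weight, d + 1)]
    return all(x <= y for x, y in zip(values, values[1:]))
```

For `weight` below $d/2$ the monotonicity can fail, for instance at `d = 4`, `weight = 1`, which matches the distance condition in the proposition.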